PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

WORLD INTELLECTUAL PROPERTY ORGANIZATION

International Bureau




(51) International Patent Classification:

C12N 15/82, 15/29, 15/53, A01H 5/00, 5/10

A1



(11) International Publication Number: WO 98/45461 

(43) International Publication Date: 15 October 1998 (15.10.98) 



(21) International Application Number: PCT/US98/07179

(22) International Filing Date: 9 April 1998 (09.04.98) 



(30) Priority Data: 

08/831,575 9 April 1997 (09.04.97) US 



(71) Applicant (for all designated States except US):
RHONE-POULENC AGRO [FR/FR]; Dépt. Propriété
Industrielle, 14-20, rue Pierre Baizet, F-69009 Lyon
(FR).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): THOMAS, Terry, L.
[US/US]; 2804 Cloister Drive, College Station, TX 77845
(US). LI, Zhongsen [CN/US]; Apartment Z-1-H, 1 Hensel,
College Station, TX 77840 (US).

(74) Agents: DiGIGLIO, Frank, S. et al.; Scully, Scott, Murphy &
Presser, 400 Garden City Plaza, Garden City, NY 11530
(US).



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, GM, GW, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ,
LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW,
MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL,
TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO
patent (GH, GM, KE, LS, MW, SD, SZ, UG, ZW), Eurasian
patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR,
IE, IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF,
CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, TG).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: AN OLEOSIN 5' REGULATORY REGION FOR THE MODIFICATION OF PLANT SEED LIPID COMPOSITION

(57) Abstract 

The present invention is directed to 5' regulatory regions of an Arabidopsis oleosin gene. The 5' regulatory regions, when operably
linked to either the coding sequence of a heterologous gene or a sequence complementary to a native plant gene, direct expression of the
coding sequence or complementary sequence in a plant seed. The regulatory regions are useful in expression cassettes and expression
vectors for the transformation of plants. Also provided are methods of modulating the levels of a heterologous gene such as a fatty acid
synthesis or lipid metabolism gene by transforming a plant with the subject expression cassettes and expression vectors.

AN OLEOSIN 5' REGULATORY REGION FOR THE
MODIFICATION OF PLANT SEED LIPID COMPOSITION

BACKGROUND OF THE INVENTION 

Seed oil content has traditionally been
modified by plant breeding. The use of recombinant
DNA technology to alter seed oil composition can
accelerate this process and in some cases alter seed
oils in a way that cannot be accomplished by breeding
alone. The oil composition of Brassica has been
significantly altered by modifying the expression of a
number of lipid metabolism genes. Such manipulations
of seed oil composition have focused on altering the
proportion of endogenous component fatty acids. For
example, antisense repression of the Δ12-desaturase
gene in transgenic rapeseed has resulted in an
increase in oleic acid of up to 83% (Topfer et al.
1995 Science 268:681-686).

There have been some successful attempts at
modifying the composition of seed oil in transgenic
plants by introducing new genes that allow the
production of a fatty acid that the host plants were
not previously capable of synthesizing. Van de Loo
et al. (1995 Proc. Natl. Acad. Sci USA 92:6743-6747)
have been able to introduce a Δ12-hydroxylase gene
into transgenic tobacco, resulting in the introduction
of a novel fatty acid, ricinoleic acid, into its seed
oil. The reported accumulation was modest from plants
carrying constructs in which transcription of the
hydroxylase gene was under the control of the
cauliflower mosaic virus (CaMV) 35S promoter.
Similarly, tobacco plants have been engineered to
produce low levels of petroselinic acid by expression
of an acyl-ACP desaturase from coriander (Cahoon et
al. 1992 Proc. Natl. Acad. Sci USA 89:11184-11188).

The long chain fatty acids (C18 and larger)
have significant economic value both as nutritionally
and medically important foods and as industrial
commodities (Ohlrogge, J.B. 1994 Plant Physiol.
104:821-826). Linoleic (18:2 Δ9,12) and α-linolenic
acid (18:3 Δ9,12,15) are essential fatty acids found
in many seed oils. The levels of these fatty acids
have been manipulated in oil seed crops through
breeding and biotechnology (Ohlrogge, et al. 1991
Biochim. Biophys. Acta 1082:1-26; Topfer et al. 1995
Science 268:681-686). Additionally, the production of
novel fatty acids in seed oils can be of considerable
use in both human health and industrial applications.

Consumption of plant oils rich in γ-linolenic
acid (GLA) (18:3 Δ6,9,12) is thought to
alleviate hypercholesterolemia and other related
clinical disorders which correlate with susceptibility
to coronary heart disease (Brenner R.R. 1976 Adv. Exp.
Med. Biol. 53:85-101). The therapeutic benefits of
dietary GLA may result from its role as a precursor to
prostaglandin synthesis (Weete, J.D. 1980 in Lipid
Biochemistry of Fungi and Other Organisms, eds. Plenum
Press, New York, pp. 59-62). Linoleic acid (18:2) (LA)
is transformed into gamma linolenic acid (18:3) (GLA)
by the enzyme Δ6-desaturase.
Few seed oils contain GLA despite high
contents of the precursor linoleic acid. This is due
to the absence of Δ6-desaturase activity in most
plants. For example, only borage (Borago
officinalis), evening primrose (Oenothera biennis),
and currants (Ribes nigrum) produce appreciable
amounts of linolenic acid. Of these three species,
only Oenothera and borage are cultivated as a
commercial source for GLA. It would be beneficial if
agronomic seed oils could be engineered to produce GLA
in significant quantities by introducing a
heterologous Δ6-desaturase gene. It would also be
beneficial if other expression products associated
with fatty acid synthesis and lipid metabolism could
be produced in plants at high enough levels so that
commercial production of a particular expression
product becomes feasible.

As disclosed in U.S. Patent 5,552,306, a
cyanobacterial Δ6-desaturase gene has been recently
isolated. Expression of this cyanobacterial gene in
transgenic tobacco resulted in significant but low
level GLA accumulation (Reddy et al. 1996 Nature
Biotech. 14:639-642). Applicant's copending U.S.
Application Serial No. 08/366,779 discloses a Δ6-desaturase
gene isolated from the plant Borago
officinalis and its expression in tobacco under the
control of the CaMV 35S promoter. Such expression
resulted in significant but low level GLA and
octadecatetraenoic acid (ODTA or OTA) accumulation in
seeds. Thus, a need exists for a promoter which
functions in plants and which consistently directs
high level expression of lipid metabolism genes in
transgenic plant seeds.

Oleosins are abundant seed proteins
associated with the phospholipid monolayer membrane of
oil bodies. The first oleosin gene, L3, was cloned
from maize by selecting clones whose in vitro
translated products were recognized by an anti-L3
antibody (Vance et al. 1987 J. Biol. Chem. 262:11275-
11279). Subsequently, different isoforms of oleosin
genes from such different species as Brassica,
soybean, carrot, pine, and Arabidopsis have been
cloned (Huang, A.H.C., 1992, Ann. Reviews Plant Phys.
and Plant Mol. Biol. 43:177-200; Kirik et al., 1996
Plant Mol. Biol. 31:413-417; Van Rooijen et al., 1992
Plant Mol. Biol. 18:1177-1179; Zou et al., Plant Mol.
Biol. 31:429-433). Oleosin protein sequences predicted
from these genes are highly conserved, especially for
the central hydrophobic domain. All of these oleosins
have the characteristic feature of three distinctive
domains. An amphipathic domain of 40-60 amino acids
is present at the N-terminus; a totally hydrophobic
domain of 68-74 amino acids is located at the center;
and an amphipathic α-helical domain of 33-40 amino
acids is situated at the C-terminus (Huang, A.H.C.
1992).

The present invention provides 5' regulatory
sequences from an oleosin gene which direct high level
expression of lipid metabolism genes in transgenic
plants. In accordance with the present invention,
chimeric constructs comprising an oleosin 5'
regulatory region operably linked to coding sequence
for a lipid metabolism gene such as a Δ6-desaturase
gene are provided. Transgenic plants comprising the
subject chimeric constructs produce levels of GLA
approaching the level found in those few plant species
which naturally produce GLA such as evening primrose
(Oenothera biennis).

SUMMARY OF THE INVENTION 

The present invention is directed to 5'
regulatory regions of an Arabidopsis oleosin gene.
The 5' regulatory regions, when operably linked to
either the coding sequence of a heterologous gene or
sequence complementary to a native plant gene, direct
expression of the heterologous gene or complementary
sequence in a plant seed.

The present invention thus provides
expression cassettes and expression vectors comprising
an oleosin 5' regulatory region operably linked to a
heterologous gene or a sequence complementary to a
native plant gene.

Plant transformation vectors comprising the
expression cassettes and expression vectors are also
provided as are plant cells transformed by these
vectors, and plants and their progeny containing the
vectors.

In one embodiment of the invention, the
heterologous gene or complementary gene sequence is a
fatty acid synthesis gene or a lipid metabolism gene.

In another aspect of the present invention,
a method is provided for producing a plant with
increased levels of a product of a fatty acid
synthesis or lipid metabolism gene.

In particular, there is provided a method
for producing a plant with increased levels of a fatty
acid synthesis or lipid metabolism gene by
transforming a plant with the subject expression
cassettes and expression vectors which comprise an
oleosin 5' regulatory region and a coding sequence for
a fatty acid synthesis or lipid metabolism gene.

In another aspect of the present invention,
there is provided a method for cosuppressing a native
fatty acid synthesis or lipid metabolism gene by
transforming a plant with the subject expression
cassettes and expression vectors which comprise an
oleosin 5' regulatory region and a coding sequence for
a fatty acid synthesis or lipid metabolism gene.

A further aspect of this invention provides
a method of decreasing production of a native plant
gene such as a fatty acid synthesis gene or a lipid
metabolism gene by transforming a plant with an
expression vector comprising an oleosin 5' regulatory
region operably linked to a nucleic acid sequence
complementary to a native plant gene.

Also provided are methods of modulating the
levels of a heterologous gene such as a fatty acid
synthesis or lipid metabolism gene by transforming a
plant with the subject expression cassettes and
expression vectors.






BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 depicts the nucleotide and
corresponding amino acid sequence of the borage Δ6-desaturase
gene (SEQ ID NO:1). The cytochrome b5
heme-binding motif is boxed and the putative metal
binding, histidine rich motifs (HRMs) are underlined.
The motifs recognized by the primers (PCR analysis)
are underlined with dotted lines, i.e. tgg aaa tgg aac
cat aa; and gag cat cat ttg ttt cc.

Fig. 2 is a dendrogram showing similarity of
the borage Δ6-desaturase to other membrane-bound
desaturases. The amino acid sequence of the borage Δ6-desaturase
was compared to other known desaturases
using GeneWorks (IntelliGenetics). Numerical values
correlate to relative phylogenetic distances between
subgroups compared.

Fig. 3A provides a gas liquid chromatography
profile of the fatty acid methyl esters (FAMES)
derived from leaf tissue of a wild type tobacco
'Xanthi'.

Fig. 3B provides a gas liquid chromatography
profile of the FAMES derived from leaf tissue of a
tobacco plant transformed with the borage Δ6-desaturase
cDNA under transcriptional control of the
CaMV 35S promoter (pAN2). Peaks corresponding to
methyl linoleate (18:2), methyl γ-linolenate (18:3γ),
methyl α-linolenate (18:3α), and methyl
octadecatetraenoate (18:4) are indicated.

Fig. 4 is the nucleotide sequence and
corresponding amino acid sequence of the oleosin AtS21
cDNA (SEQ ID NO:3).

Fig. 5 is an acidic-base map of the
predicted AtS21 protein generated by DNA Strider 1.2.

Fig. 6 is a Kyte-Doolittle plot of the
predicted AtS21 protein generated by DNA Strider 1.2.

Fig. 7 is a sequence alignment of oleosins
isolated from Arabidopsis. Oleosin sequences
published or deposited in EMBL, BCM, NCBI databases
were aligned to each other using GeneWorks® 2.3.
Identical residues are boxed with rectangles. The
seven sequences fall into three groups. The first
group includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID
NO:6) and Z29859 (SEQ ID NO:7). The second group
includes X62352 (SEQ ID NO:8) and Atol3 (SEQ ID NO:9).
The third group includes X91956 (SEQ ID NO:10) and
L40954 (SEQ ID NO:11). Differences in amino acid
residues within the same group are indicated by
shadows. Atol2/Z54164 is identical to AtS21. Atol3
sequence (Accession No. Z54165 in EMBL database) is
actually not disclosed in the EMBL database. The
Z54165 Accession number designates the same sequence
as Z54164 which is Atol2.

Fig. 8A is a Northern analysis of the AtS21
gene. An RNA gel blot containing ten micrograms of
total RNA extracted from Arabidopsis flowers (F),
leaves (L), roots (R), developing seeds (Se), and
developing silique coats (Si) was hybridized with a
probe made from the full-length AtS21 cDNA.

Fig. 8B is a Southern analysis of the AtS21
gene. A DNA gel blot containing ten micrograms of
genomic DNA digested with BamHI (B), EcoRI (E),
HindIII (H), SacI (S), and XbaI (X) was hybridized
with a probe made from the full length AtS21 cDNA.

Fig. 9 is the nucleotide sequence of the
SacI fragment of AtS21 genomic DNA (SEQ ID NO:12).
The promoter and intron sequences are in uppercase.
The fragments corresponding to AtS21 cDNA sequence are
in lower case. The first ATG codon and a putative
TATA box are shadowed. The sequence complementary to
21P primer for PCR amplification is boxed. A putative
abscisic acid response element (ABRE) and two 14 bp
repeats are underlined.

Fig. 10 is a map of AtS21 promoter/GUS
construct (pAN5).

Fig. 11A depicts AtS21/GUS gene expression
in Arabidopsis bolt and leaves.

Fig. 11B depicts AtS21/GUS gene expression
in Arabidopsis siliques.

Fig. 11C depicts AtS21/GUS gene expression
in Arabidopsis developing seeds.

Figs. 11D through 11J depict AtS21/GUS gene
expression in Arabidopsis developing embryos.

Fig. 11K depicts AtS21/GUS gene expression
in Arabidopsis root and root hairs of a young
seedling.

Fig. 11L depicts AtS21/GUS gene expression
in Arabidopsis cotyledons and the shoot apex of a five
day seedling.

Figs. 11M and 11N depict AtS21/GUS gene
expression in Arabidopsis cotyledons and the shoot
apex of 5-15 day seedlings.

Fig. 12A depicts AtS21/GUS gene expression 
in tobacco embryos and endosperm. 

Fig. 12B depicts AtS21/GUS gene expression
in germinating tobacco seeds.

Fig. 12C depicts AtS21/GUS gene expression 
in a 5 day old tobacco seedling. 

Fig. 12D depicts AtS21/GUS gene expression 
in 5-15 day old tobacco seedlings. 

Fig. 13A is a Northern analysis showing
AtS21 mRNA levels in developing wild-type Arabidopsis
seedlings. Lane 1 was loaded with RNA from developing
seeds, lane 2 was loaded with RNA from seeds imbibed
for 24-48 hours, lane 3: 3 day seedlings; lane 4: 4
day seedlings; lane 5: 5 day seedlings; lane 6: 6 day
seedlings; lane 7: 9 day seedlings; lane 8: 12 day
seedlings. Probe was labeled AtS21 cDNA. Exposure
was for one hour at -80°C.

Fig. 13B is the same blot as Fig. 13A only
exposure was for 24 hours at -80°C.

Fig. 13C is the same blot depicted in Figs.
13A and 13B after stripping and hybridization with an
Arabidopsis tubulin gene probe. The small band in
each of lanes 1 and 2 is the remnant of the previous
AtS21 probe. Exposure was for 48 hours at -80°C.

Fig. 14 is a graph comparing GUS activities
expressed by the AtS21 and 35S promoters. GUS
activities expressed by the AtS21 promoter in
developing Arabidopsis seeds and leaf are plotted side
by side with those expressed by the 35S promoter. The
GUS activities expressed by the AtS21 promoter in
tobacco dry seed and leaf are plotted on the right
side of the figure. GUS activity in tobacco leaf is
so low that no column appears. "G-H" denotes globular
to heart stage; "H-T" denotes heart to torpedo stage;
"T-C" denotes torpedo to cotyledon stage; "Early C"
denotes early cotyledon; "Late C" denotes late
cotyledon. The standard deviations are listed in
Table 2.

Fig. 15A is an RNA gel blot analysis carried
out on 5 μg samples of RNA isolated from borage leaf,
root, and 12 dpp embryo tissue, using labeled borage
Δ6-desaturase cDNA as a hybridization probe.

Fig. 15B depicts a graph corresponding to
the Northern analysis results for the experiment shown
in Fig. 15A.

Fig. 16A is a graph showing relative legumin
RNA accumulation in developing borage embryos based on
results of Northern blot.

Fig. 16B is a graph showing relative
oleosin RNA accumulation in developing borage embryos
based on results of Northern blot.

Fig. 16C is a graph showing relative Δ6-desaturase
RNA accumulation in developing borage
embryos based on results of Northern blot.

Fig. 17 is a PCR analysis showing the
presence of the borage delta 6-desaturase gene in
transformed plants of oilseed rape. Lanes 1, 3 and 4
were loaded with PCR reactions performed with DNA from
plants transformed with the borage delta 6-desaturase
gene linked to the oleosin 5' regulatory region; lane
2: DNA from plant transformed with the borage delta 6-desaturase
gene linked to the albumin 5' regulatory
region; lanes 5 and 6: DNA from non-transformed
plants; lane 7: molecular weight marker (1 kb ladder,
Gibco BRL); lane 8: PCR without added template DNA;
lane 9: control with DNA from Agrobacterium
tumefaciens EHA 105 containing the plasmid pAN3 (i.e.
the borage delta 6-desaturase gene linked to the
oleosin 5' regulatory region).

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides isolated
nucleic acids encoding 5' regulatory regions from an
Arabidopsis oleosin gene. In accordance with the
present invention, the subject 5' regulatory regions,
when operably linked to either a coding sequence of a
heterologous gene or a sequence complementary to a
native plant gene, direct expression of the coding
sequence or complementary sequence in a plant seed.
The oleosin 5' regulatory regions of the present
invention are useful in the construction of an
expression cassette which comprises in the 5' to 3'
direction, a subject oleosin 5' regulatory region, a
heterologous gene or sequence complementary to a
native plant gene under control of the regulatory
region and a 3' termination sequence. Such an
expression cassette can be incorporated into a variety
sequences have been published in Kirik et al. 1996
Plant Mol. Biol. 31:413-417; Zou et al. Plant Mol.
Biol. 31:429-433; Van Rooijen et al. 1992 Plant Mol.
Biol. 18:1177-1179.

Virtual subtraction screening of a tissue
specific library using a random primed polymerase
chain reaction (RP-PCR) cDNA probe is another method of
obtaining an oleosin cDNA useful for screening a plant
genomic DNA library. Virtual subtraction screening
refers to a method where a cDNA library is constructed
from a target tissue and displayed at a low density so
that individual cDNA clones can be easily separated.
These cDNA clones are subtractively screened with
driver quantities (i.e., concentrations of DNA to
kinetically drive the hybridization reaction) of cDNA
probes made from tissue or tissues other than the
target tissue (i.e. driver tissue). The hybridized
plaques represent genes that are expressed in both the
target and the driver tissues; the unhybridized
plaques represent genes that may be target tissue-specific
or low abundant genes that cannot be
detected by the driver cDNA probe. The unhybridized
cDNAs are selected as putative target tissue-specific
genes and further analyzed by one-pass sequencing and
Northern hybridization.
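
The selection logic of virtual subtraction screening can be sketched in a few lines. This is only an illustration under stated assumptions: the clone names, the signal values and the numeric threshold are hypothetical, and in practice hybridization is scored from plaque lifts rather than from a table of numbers.

```python
def virtual_subtraction(plaque_signals, threshold):
    """Split cDNA clones by whether the driver-tissue probe hybridized.

    plaque_signals maps a clone name to its hybridization signal with the
    driver cDNA probe (arbitrary units; hypothetical data).
    The threshold separating "hybridized" from "unhybridized" is an
    assumption made for this sketch.
    """
    shared = []      # hybridized: expressed in both target and driver tissue
    candidates = []  # unhybridized: putative target-specific or low abundance
    for clone, signal in sorted(plaque_signals.items()):
        if signal >= threshold:
            shared.append(clone)
        else:
            candidates.append(clone)
    return shared, candidates


signals = {"clone_A": 950.0, "clone_B": 12.0, "clone_C": 430.0, "clone_D": 3.5}
shared, candidates = virtual_subtraction(signals, threshold=100.0)
print("candidates for one-pass sequencing and Northern analysis:", candidates)
```

Note that an unhybridized clone is only a candidate: as the text states, it may be a low abundance transcript rather than a tissue-specific one, which is why the follow-up analyses are needed.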

Random primed PCR (RP-PCR) involves
synthesis of large quantities of cDNA probes from a
trace amount of cDNA template. The method combines
the amplification power of PCR with the representation
of random priming to simultaneously amplify and label
double-stranded cDNA in a single tube reaction.

Methods considered useful in obtaining
oleosin genomic recombinant DNA are provided in
Sambrook et al., 1989, in Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, NY, for
example, or any of the myriad of laboratory manuals on
recombinant DNA technology that are widely available.
To determine nucleotide sequences, a multitude of
techniques are available and known to the ordinarily
skilled artisan. For example, restriction fragments
containing an oleosin regulatory region can be
subcloned into the polylinker site of a sequencing
vector such as pBluescript (Stratagene). These
pBluescript subclones can then be sequenced by the
double-stranded dideoxy method (Chen and Seeburg,
1985, DNA 4:165).

In a preferred embodiment, the oleosin
regulatory region comprises nucleotides 1-1267 of Fig.
9 (SEQ ID NO:12). Modifications to the oleosin
regulatory region as set forth in SEQ ID NO:12 which
maintain the characteristic property of directing
seed-specific expression are within the scope of the
present invention. Such modifications include
insertions, deletions and substitutions of one or more
nucleotides.

The 5' regulatory region of the present
invention can be derived from restriction endonuclease
or exonuclease digestion of an oleosin genomic clone.
Thus, for example, the known nucleotide or amino acid
sequence of the coding region of an isolated oleosin
gene (e.g. Fig. 7) is aligned to the nucleic acid or
deduced amino acid sequence of an isolated oleosin
genomic clone and 5' flanking sequence (i.e., sequence
upstream from the translational start codon of the
coding region) of the isolated oleosin genomic clone
located.
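
As a rough illustration of locating 5' flanking sequence, the sketch below finds the start of a coding region within a genomic clone and returns everything upstream of the start codon. It assumes an exact string match stands in for a real sequence alignment, and the sequences are invented toy strings, not taken from SEQ ID NO:12.

```python
def five_prime_flank(genomic, coding_start):
    """Return the genomic sequence upstream of the translational start codon.

    coding_start is the first stretch of the coding region (beginning with
    ATG) taken from the cDNA. It should be short enough not to span an
    intron, because this sketch uses exact matching instead of alignment.
    """
    genomic = genomic.upper()
    probe = coding_start.upper()
    if not probe.startswith("ATG"):
        raise ValueError("coding probe should begin with the ATG start codon")
    position = genomic.find(probe)
    if position < 0:
        raise ValueError("coding sequence not found in the genomic clone")
    return genomic[:position]


# Toy data: 16 nt of upstream sequence, then the start of a coding region.
toy_genomic = "GGCTATAAATACCAGT" + "ATGGCGGATACC" + "GTAAGTCCA"
flank = five_prime_flank(toy_genomic, "atggcggatacc")
print(len(flank), flank)
```

A real analysis would use an alignment tool that tolerates introns and sequencing differences; the exact-match search here only conveys the idea of anchoring on the start codon.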

The oleosin 5' regulatory region as set
forth in SEQ ID NO:12 (nucleotides 1-1267 of Fig. 9)
may be generated from a genomic clone having either or
both excess 5' flanking sequence or coding sequence by
exonuclease III-mediated deletion. This is
accomplished by digesting appropriately prepared DNA
with exonuclease III (exoIII) and removing aliquots at
increasing intervals of time during the digestion.
The resulting successively smaller fragments of DNA
may be sequenced to determine the exact endpoint of
the deletions. There are several commercially
available systems which use exonuclease III (exoIII)
to create such a deletion series, e.g. Promega
Biotech, "Erase-A-Base" system. Alternatively, PCR
primers can be defined to allow direct amplification
of the subject 5' regulatory regions.
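
The idea of a deletion series (aliquots removed at increasing times yield successively smaller fragments) can be modeled as follows. This is a simplified sketch: it assumes a uniform number of bases is removed per time point and that deletion proceeds from one end only, whereas real digestion endpoints vary and must be confirmed by sequencing, as noted above. The 40 nt region is a placeholder.

```python
def deletion_series(region, step, min_length):
    """Yield (bases_removed, fragment) for successive unidirectional deletions.

    region      sequence to be shortened from its 5' end
    step        bases removed between aliquots (idealized, assumed constant)
    min_length  stop once the remaining fragment would be shorter than this
    """
    if step <= 0:
        raise ValueError("step must be positive")
    removed = 0
    while len(region) - removed >= min_length:
        yield removed, region[removed:]
        removed += step


toy_region = "ACGT" * 10  # 40 nt placeholder, not a real promoter
series = list(deletion_series(toy_region, step=10, min_length=10))
for removed, fragment in series:
    print(f"removed {removed:2d} nt, {len(fragment):2d} nt remain")
```

Each fragment is a contiguous portion of the starting region and shares its 3' end, which is the property needed when testing which upstream sequences are required for expression.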

Using the same methodologies, the
ordinarily skilled artisan can generate one or more
deletion fragments of nucleotides 1-1267 as set forth
in SEQ ID NO:12. Any and all deletion fragments which
comprise a contiguous portion of nucleotides set forth
in SEQ ID NO:12 and which retain the capacity to
direct seed-specific expression are contemplated by
the present invention.

The identification of oleosin 5' regulatory
sequences which direct seed-specific expression,
comprising nucleotides 1-1267 of SEQ ID NO:12 and
modifications or deletion fragments thereof, can be
accomplished by transcriptional fusions of specific
sequences with the coding sequences of a heterologous
gene, transfer of the chimeric gene into an
appropriate host, and detection of the expression of
the heterologous gene. The assay used to detect
expression depends upon the nature of the heterologous
sequence. For example, reporter genes, exemplified by
chloramphenicol acetyl transferase and β-glucuronidase
(GUS), are commonly used to assess transcriptional and
translational competence of chimeric constructions.
Standard assays are available to sensitively detect
the reporter enzyme in a transgenic organism. The
β-glucuronidase (GUS) gene is useful as a reporter of
promoter activity in transgenic plants because of the
high stability of the enzyme in plant cells, the lack
of intrinsic β-glucuronidase activity in higher plants
and availability of a quantitative fluorimetric assay
and a histochemical localization technique. Jefferson
et al. (1987 EMBO J 6:3901) have established standard
procedures for biochemical and histochemical detection
of GUS activity in plant tissues. Biochemical assays
are performed by mixing plant tissue lysates with
4-methylumbelliferyl-β-D-glucuronide, a fluorimetric
substrate for GUS, incubating one hour at 37°C, and
then measuring the fluorescence of the resulting
4-methyl-umbelliferone. Histochemical localization for
GUS activity is determined by incubating plant tissue
samples in 5-bromo-4-chloro-3-indolyl-glucuronide
(X-Gluc) for about 18 hours at 37°C and observing the
staining pattern of X-Gluc. The construction of such
chimeric genes allows definition of specific
regulatory sequences and demonstrates that these
sequences can direct expression of heterologous genes
in a seed-specific manner.

Another aspect of the invention is directed
to expression cassettes and expression vectors (also
termed herein "chimeric genes") comprising a 5'
regulatory region from an oleosin gene which directs
seed specific expression operably linked to the coding
sequence of a heterologous gene such that the
regulatory element is capable of controlling
expression of the product encoded by the heterologous
gene. The heterologous gene can be any gene other
than oleosin. If necessary, additional regulatory
elements or parts of these elements sufficient to
cause expression resulting in production of an
effective amount of the polypeptide encoded by the
heterologous gene are included in the chimeric
constructs.

Accordingly, the present invention provides
chimeric genes comprising sequences of the oleosin 5'
regulatory region that confer seed-specific expression
which are operably linked to a sequence encoding a
heterologous gene such as a lipid metabolism enzyme.
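
The fluorimetric GUS assay described above yields a fluorescence reading after a timed incubation; converting that reading to a specific activity is simple arithmetic. The sketch below assumes a linear standard curve for 4-methyl-umbelliferone and reports activity as pmol per minute per mg protein. These units, the function name and all numeric values are assumptions chosen for illustration and are not taken from the document.

```python
def gus_specific_activity(fluorescence, blank, units_per_pmol, minutes, protein_mg):
    """Return GUS activity as pmol 4-MU released per minute per mg protein.

    fluorescence    reading of the sample after incubation (arbitrary units)
    blank           reading of a no-lysate control (arbitrary units)
    units_per_pmol  slope of an assumed linear 4-MU standard curve
    minutes         incubation time (the text describes one hour at 37 C)
    protein_mg      protein in the assayed lysate, in mg
    """
    if units_per_pmol <= 0 or minutes <= 0 or protein_mg <= 0:
        raise ValueError("calibration slope, time and protein must be positive")
    pmol_4mu = (fluorescence - blank) / units_per_pmol
    return pmol_4mu / minutes / protein_mg


activity = gus_specific_activity(
    fluorescence=1250.0,   # hypothetical reading
    blank=50.0,            # hypothetical blank
    units_per_pmol=2.0,    # hypothetical standard-curve slope
    minutes=60.0,
    protein_mg=0.01,
)
print(f"{activity:.1f} pmol 4-MU / min / mg protein")
```

Normalizing to protein and time is what makes activities from different tissues or promoters comparable, as in a side-by-side comparison of the AtS21 and 35S promoters.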

Examples of lipid metabolism genes useful for
practicing the present invention include lipid
desaturases such as Δ6-desaturases, Δ12-desaturases,
Δ15-desaturases and other related desaturases such as
stearoyl-ACP desaturases, acyl carrier proteins
(ACPs), thioesterases, acetyl transacylases, acetyl-coA
carboxylases, ketoacyl-synthases, malonyl
transacylases, and elongases. Such lipid metabolism
genes have been isolated and characterized from a
number of different bacteria and plant species. Their
nucleotide coding sequences as well as methods of
isolating such coding sequences are disclosed in the
published literature and are widely available to those
of skill in the art.

In particular, the Δ6-desaturase genes
disclosed in U.S. Patent No. 5,552,306 and
applicants' copending U.S. Application Serial No.
08/366,779 filed December 30, 1994 and incorporated
herein by reference, are contemplated as lipid
metabolism genes particularly useful in the practice
of the present invention.

The chimeric genes of the present invention
are constructed by ligating a 5' regulatory region of
an oleosin genomic DNA to the coding sequence of a
heterologous gene. The juxtaposition of these
sequences can be accomplished in a variety of ways.
In a preferred embodiment the order of the sequences,
from 5' to 3', is an oleosin 5' regulatory region
(including a promoter), a coding sequence, and a
termination sequence which includes a polyadenylation
site.

Standard techniques for construction of such
chimeric genes are well known to those of ordinary skill
in the art. [illegible] available for ligating fragments
of DNA, the choice of which depends on the nature of the
[illegible] fragments. One of ordinary skill in the art
recognizes that in order for the heterologous gene to
be expressed, the construction requires promoter
elements and signals for efficient polyadenylation of
the transcript. Accordingly, the oleosin 5'
regulatory region that contains the consensus promoter
sequence known as the TATA box can be ligated
to a promoterless heterologous coding sequence.

[illegible] contain the oleosin [illegible]
direction to a promoterless heterologous gene such
as the coding sequence of β-glucuronidase (GUS). The
skilled artisan will recognize that the subject
oleosin 5' regulatory regions can be provided by other
means, for example chemical or enzymatic synthesis.
The 3' end of a heterologous coding sequence is
optionally ligated to a termination sequence
comprising a polyadenylation site, exemplified by, but
not limited to, the nopaline synthase polyadenylation
site, or the octopine T-DNA gene 7 polyadenylation
site. Alternatively, the polyadenylation site can be
provided by the heterologous gene.

The present invention also provides methods
of increasing levels of heterologous genes in plant
seeds. In accordance with such methods, the subject
expression cassettes and expression vectors are
introduced into a plant in order to effect expression
of a heterologous gene. For example, a method of
producing a plant with increased levels of a product
of a fatty acid synthesis or lipid metabolism gene is
provided by transforming a plant cell with an
expression vector comprising an oleosin 5' regulatory
region operably linked to a fatty acid synthesis or
lipid metabolism gene and regenerating a plant with
increased levels of the product of said fatty acid
synthesis or lipid metabolism gene.

Another aspect of the present invention
provides methods of reducing levels of a product of a
gene which is native to a plant which comprises
transforming a plant cell with an expression vector
comprising a subject oleosin regulatory region
operably linked to a nucleic acid sequence which is
complementary to the native plant gene. In this
manner, levels of endogenous product of the native
plant gene are reduced through the mechanism known as
antisense regulation. Thus, for example, levels of a
product of a fatty acid synthesis gene or lipid
metabolism gene are reduced by transforming a plant
with an expression vector comprising a subject oleosin
5' regulatory region operably linked to a nucleic acid
sequence which is complementary to a nucleic acid
sequence coding for a native fatty acid synthesis or
lipid metabolism gene.
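
By way of illustration only, the relationship between a native coding sequence and a nucleic acid sequence complementary to it can be sketched as a reverse complement. The function name and the short example sequence below are hypothetical and are not taken from the disclosure.

```python
# Hypothetical helper: derive the antisense (reverse-complement) strand
# of a DNA coding sequence. The example sequence is made up.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


sense = "ATGGCTGAAGGT"
antisense = reverse_complement(sense)
print(antisense)
```

Applying the function twice returns the original sense strand, which is a convenient check that a construct has been designed in the intended orientation.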

The present invention also provides a method
of cosuppressing a gene which is native to a plant
which comprises transforming a plant cell with an
expression vector comprising a subject oleosin 5'
regulatory region operably linked to a nucleic acid
sequence coding for the native plant gene. In this
manner, levels of endogenous product of the native
plant gene are reduced through the mechanism known as
cosuppression. Thus, for example, levels of a product
of a fatty acid synthesis gene or lipid metabolism
gene are reduced by transforming a plant with an
expression vector comprising a subject oleosin 5'
regulatory region operably linked to a nucleic acid
sequence coding for a fatty acid synthesis or
lipid metabolism gene native to the plant. Although
the exact mechanism of cosuppression is not completely
understood, one skilled in the art is familiar with
published works reporting the experimental conditions
and results associated with cosuppression (Napoli et
al. 1990 The Plant Cell 2:279-289; Van der Krol 1990
The Plant Cell 2:291-299).

To provide regulated expression of the
heterologous or native genes, plants are transformed
with the chimeric gene constructions of the invention.
Methods of gene transfer are well known in the art.
The chimeric genes can be introduced into plants by a
leaf disk transformation-regeneration procedure as
described by Horsch et al. 1985 Science 227:1229.
Other methods of transformation such as protoplast
culture (Horsch et al. 1984 Science 223:496, DeBlock
et al. 1984 EMBO J. 2:2143, Barton et al. 1983, Cell
32:1033) can also be used and are within the scope of
this invention. In a preferred embodiment, plants are
transformed with Agrobacterium-derived vectors such as
those described in Klett et al. (1987) Annu. Rev.
Plant Physiol. 38:467. Other well-known methods are
available to insert the chimeric genes of the present
invention into plant cells. Such alternative methods
include biolistic approaches (Klein et al. 1987 Nature
327:70), electroporation, chemically-induced DNA
uptake, and use of viruses or pollen as vectors.

When necessary for the transformation
method, the chimeric genes of the present invention
can be inserted into a plant transformation vector,
e.g. the binary vector described by Bevan, M. 1984
Nucleic Acids Res. 12:8711-8721. Plant transformation
vectors can be derived by modifying the natural gene
transfer system of Agrobacterium tumefaciens. The
natural system comprises large Ti (tumor-inducing)-
plasmids containing a large segment, known as T-DNA,
which is transferred to transformed plants. Another
segment of the Ti plasmid, the vir region, is
responsible for T-DNA transfer. The T-DNA region is
bordered by terminal repeats. In the modified binary
vectors, the tumor inducing genes have been deleted
and the functions of the vir region are utilized to
transfer foreign DNA bordered by the T-DNA border
sequences. The T-region also contains a selectable
marker for antibiotic resistance, and a multiple
cloning site for inserting sequences for transfer.
Such engineered strains are known as "disarmed" A.
tumefaciens strains, and allow the efficient transfer
of sequences bordered by the T-region into the nuclear
genome of plants.

Surface-sterilized leaf disks and other
susceptible tissues are inoculated with the "disarmed"
foreign DNA-containing A. tumefaciens, cultured for a
number of days, and then transferred to antibiotic-
containing medium. Transformed shoots are then
selected after rooting in medium containing the
appropriate antibiotic, and transferred to soil.
Transgenic plants are pollinated and seeds from these
plants are collected and grown on antibiotic medium.

Expression of a heterologous or reporter
gene in developing seeds, young seedlings and mature
plants can be monitored by immunological,
histochemical or activity assays. As discussed
herein, the choice of an assay for expression of the
chimeric gene depends upon the nature of the
heterologous coding region. For example, Northern
analysis can be used to assess transcription if
appropriate nucleotide probes are available. If
antibodies to the polypeptide encoded by the
heterologous gene are available, Western analysis and
immunohistochemical localization can be used to assess
the production and localization of the polypeptide.
Depending upon the heterologous gene, appropriate
biochemical assays can be used. For example,
acetyltransferases are detected by measuring
acetylation of a standard substrate. The expression
of a lipid desaturase gene can be assayed by analysis
of fatty acid methyl esters (FAMEs).

Another aspect of the present invention
provides transgenic plants or progeny of these plants
containing the chimeric genes of the invention. Both
monocotyledonous and dicotyledonous plants are
contemplated. Plant cells are transformed with the
chimeric genes by any of the plant transformation
methods described above. The transformed plant cell,
usually in the form of a callus culture, leaf disk,
explant or whole plant (via the vacuum infiltration
method of Bechtold et al. 1993 C.R. Acad. Sci. Paris,
316:1194-1199) is regenerated into a complete
transgenic plant by methods well-known to one of
ordinary skill in the art (e.g. Horsch et al. 1985
Science 227:1229). In a preferred embodiment, the
transgenic plant is sunflower, cotton, oil seed rape,
maize, tobacco, Arabidopsis, peanut or soybean. Since
progeny of transformed plants inherit the chimeric
genes, seeds or cuttings from transformed plants are
used to maintain the transgenic line.

The following examples further illustrate
the invention.

EXAMPLE 1

Isolation of Membrane-Bound Polysomal
RNA and Construction of Borage cDNA Library

Membrane-bound polysomes were isolated from
borage seeds 12 days post pollination (12 DPP) using
the protocol established for peas by Larkins and
Davies (1975 Plant Phys. 55:749-756). RNA was
extracted from the polysomes as described by Mechler
(1987 Methods in Enzymology 152:241-243, Academic
Press). Poly-A+ RNA was isolated from the membrane
bound polysomal RNA using Oligotex-dT beads (Qiagen).

Corresponding cDNA was made using
Stratagene's ZAP cDNA synthesis kit. The cDNA library
was constructed in the lambda ZAP II vector
(Stratagene) using the lambda ZAP II kit. The primary
library was packaged with Gigapack II Gold packaging
extract (Stratagene).

EXAMPLE 2

Isolation of a Δ6-Desaturase cDNA from Borage

Hybridization protocol

The amplified borage cDNA library was plated
at low density (500 pfu on 150 mm petri dishes).
Highly prevalent seed storage protein cDNAs were
reduced (subtracted from the total cDNAs) by screening
with the corresponding cDNAs.



Hybridization probes for screening the
borage cDNA library were generated by using random
primed DNA synthesis as described by Ausubel et al.
(1994 Current Protocols in Molecular Biology, Wiley
Interscience, N.Y.) and corresponded to previously
identified abundantly expressed seed storage protein
cDNAs. Unincorporated nucleotides were removed by use
of a G-50 spin column (Boehringer Mannheim). Probe was
denatured for hybridization by boiling in a water bath
for 5 minutes, then quickly cooled on ice.
Nitrocellulose filters carrying fixed recombinant
bacteriophage were prehybridized at 60°C for 2-4 hours
in hybridization solution [4X SET (600 mM NaCl, 80 mM
Tris-HCl, 4 mM Na2EDTA; pH 7.8), 5X Denhardt's reagent
(0.1% bovine serum albumin, 0.1% Ficoll, and 0.1%
polyvinylpyrrolidone), 100 µg/ml denatured salmon sperm
DNA, 50 µg/ml polyadenine and 10 µg/ml polycytidine].
This was replaced with fresh hybridization solution to
which denatured radioactive probe (2 ng/ml
hybridization solution) was added. The filters were
incubated at 60°C with agitation overnight. Filters
were washed sequentially in 4X, 2X, and 1X SET (150 mM
NaCl, 20 mM Tris-HCl, 1 mM Na2EDTA; pH 7.8) for 15
minutes each at 60°C. Filters were air dried and then
exposed to X-ray film for 24 hours with intensifying
screens at -80°C.
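
By way of illustration only, the wash series can be checked arithmetically: each nX SET solution is the stated 1X SET composition multiplied by n. The sketch below uses only the concentrations given in the text; the function and variable names are hypothetical.

```python
# 1X SET composition as stated in the text (concentrations in mM, pH 7.8).
SET_1X_MM = {"NaCl": 150.0, "Tris-HCl": 20.0, "Na2EDTA": 1.0}


def set_buffer(strength: float) -> dict:
    """Scale the 1X SET recipe to the requested strength (e.g. 4 for 4X)."""
    return {name: conc * strength for name, conc in SET_1X_MM.items()}


# The sequential washes described above: 4X, then 2X, then 1X.
for strength in (4, 2, 1):
    print(f"{strength}X SET (mM):", set_buffer(strength))
```

Scaling to 4X reproduces the 600 mM NaCl, 80 mM Tris-HCl and 4 mM Na2EDTA given for the hybridization solution.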

Non-hybridizing plaques were excised using
Stratagene's excision protocol and reagents.
Resulting bacterial colonies were used to inoculate
liquid cultures and were either sequenced manually or
by an ABI automated sequencer.

Random Sequencing of cDNAs from a Borage Seed (12 DPP)
Membrane-Bound Polysomal Library

Each cDNA corresponding to a non-hybridizing
plaque was sequenced once and a sequence
tag generated from 200-300 base pairs. All sequencing
was performed by cycle sequencing (Epicentre). Over
300 expressed sequence tags (ESTs) were generated.
Each sequence tag was compared to the GenBank database
using the BLAST algorithm (Altschul et al. 1990 J.
Mol. Biol. 215:403-410). A number of lipid metabolism
genes, including the Δ6-desaturase, were identified.

Database searches with the cDNA clone
designated mbp-65 using BLASTX with the GenBank
database resulted in a significant match to the
previously isolated Synechocystis Δ6-desaturase. It
was determined, however, that mbp-65 was not a full
length cDNA. A full length cDNA was isolated using
mbp-65 to screen the borage membrane-bound polysomal
library. The resultant clone was designated pAN1 and
the cDNA insert of pAN1 was sequenced by the cycle
sequencing method. The amino acid sequence deduced
from the open reading frame (Fig. 1, SEQ ID NO:1) was
compared to other known desaturases using Geneworks
(IntelliGenetics) protein alignment program. This
alignment indicated that the cDNA insert of pAN1 was
the borage Δ6-desaturase gene.

The resulting dendrogram (Figure 2) shows
that Δ15-desaturases and Δ12-desaturases comprise two
groups. The newly isolated borage sequence and the
previously isolated Synechocystis Δ6-desaturase (U.S.
Patent No. 5,552,306) formed a third distinct group.
A comparison of amino acid motifs common to
desaturases and thought to be involved catalytically
in metal binding illustrates the overall similarity of
the protein encoded by the borage gene to desaturases
in general and the Synechocystis Δ6-desaturase in
particular (Table 1). At the same time, comparison of
the motifs in Table 1 indicates definite differences
between this protein and other plant desaturases.
Furthermore, the borage sequence is also distinguished
from known plant membrane associated fatty acid
desaturases by the presence of a heme binding motif
conserved in cytochrome b5 proteins (Schmidt et al.
1994 Plant Mol. Biol. 26:631-642) (Figure 1). Thus,
while these results clearly suggested that the
isolated cDNA was a borage Δ6-desaturase gene, further
confirmation was necessary. To confirm the identity
of the borage Δ6-desaturase cDNA, the cDNA insert from
pAN1 was cloned into an expression cassette for stable
expression. The vector pBI121 (Jefferson et al. 1987
EMBO J. 6:3901-3907) was prepared for ligation by
digestion with BamHI and EcoICR I (an isoschizomer of
SacI which leaves blunt ends; available from Promega)
which excises the GUS coding region leaving the 35S
promoter and NOS terminator intact. The borage Δ6-
desaturase cDNA was excised from the recombinant
plasmid (pAN1) by digestion with BamHI and XhoI. The
XhoI end was made blunt by performing a fill-in
reaction catalyzed by the Klenow fragment of DNA
polymerase I. This fragment was then cloned into the
BamHI/EcoICR I sites of pBI121.1, resulting in the
plasmid pAN2.
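
By way of illustration only, a cloning strategy of this kind is commonly planned by locating the relevant recognition sites in the insert. The sketch below scans a sequence for the BamHI (GGATCC), XhoI (CTCGAG) and EcoICR I/SacI (GAGCTC) recognition sequences. The function name and the short test sequence are hypothetical and do not represent the borage cDNA.

```python
# Recognition sequences of the enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG", "EcoICRI": "GAGCTC"}


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Return, per enzyme, the 0-based start positions of its recognition site."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


# Made-up insert flanked by a BamHI site and an XhoI site.
insert = "GGATCCATGGCTAAACTCGAG"
print(find_sites(insert))
```

An insert that shows a site only at each end, as in this made-up case, can be excised as a single fragment by the corresponding double digestion.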

EXAMPLE 3

Production of Transgenic
Plants and Preparation and
Analysis of Fatty Acid Methyl Esters (FAMEs)



The expression plasmid, pAN2, was used to
transform tobacco (Nicotiana tabacum cv. xanthi) via
Agrobacterium tumefaciens according to standard
procedures (Horsch et al. 1985 Science 227:1229-1231;
Bogue et al. 1990 Mol. Gen. Genet. 221:49-57) except
that the initial transformants were selected on 100
µg/ml kanamycin.

Tissue from transgenic plants was frozen in
liquid nitrogen and lyophilized overnight. FAMEs were
prepared as described by Dahmer et al. (1989) J.
Amer. Oil Chem. Soc. 66:543-548. In some cases, the
solvent was evaporated again, and the FAMEs were
resuspended in ethyl acetate and extracted once with
deionized water to remove any water soluble
contaminants. FAMEs were analyzed using a Tracor-560
gas liquid chromatograph as previously described
(Reddy et al. 1996 Nature Biotech. 14:639-642).

As shown in Figure 3, transgenic tobacco
leaves containing the borage cDNA produced both GLA
and octadecatetraenoic acid (OTA) (18:4 Δ6,9,12,15).
These results thus demonstrate that the isolated cDNA
encodes a borage Δ6-desaturase.

EXAMPLE 4

Expression of Δ6-desaturase in Borage

The native expression of Δ6-desaturase was
examined by Northern analysis of RNA derived from
borage tissues. RNA was isolated from developing
borage embryos following the method of Chang et al.
1993 Plant Mol. Biol. Rep. 11:113-116. RNA was
electrophoretically separated on formaldehyde-agarose
gels, blotted to nylon membranes by capillary
transfer, and immobilized by baking at 80°C for 30
minutes following standard protocols (Brown T., 1996
in Current Protocols in Molecular Biology, eds.
Ausubel, et al. [Greene Publishing and Wiley-
Interscience, New York] pp. 4.9.1-4.9.14). The
filters were preincubated at 42°C in a solution
containing 50% deionized formamide, 5X Denhardt's
reagent, 5X SSPE (900 mM NaCl; 50 mM sodium phosphate,
pH 7.7; and 5 mM EDTA), 0.1% SDS, and 200 µg/ml
denatured salmon sperm DNA. After two hours, the
filters were added to a fresh solution of the same
composition with the addition of denatured radioactive
hybridization probe. In this instance, the probes
used were borage legumin cDNA (Fig. 16A), borage
oleosin cDNA (Fig. 16B), and borage Δ6-desaturase cDNA
(pAN1, Example 2) (Fig. 16C). The borage legumin and
oleosin cDNAs were isolated by EST cloning and
identified by comparison to the GenBank database using
the BLAST algorithm as described in Example 2.
Loading variation was corrected by normalizing to
levels of borage EF1α mRNA. EF1α mRNA was identified
by correlating to the corresponding cDNA obtained by
the EST analysis described in Example 2. The filters
were hybridized at 42°C for 12-20 hours, then washed
as described above (except that the temperature was
65°C), air dried, and exposed to X-ray film.
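
By way of illustration only, correcting for loading variation amounts to dividing each target signal by the loading-control signal (here EF1α) from the same lane. The function name and all intensity values below are hypothetical; the text does not state how band intensities were quantified.

```python
def normalize_to_control(target: list, control: list) -> list:
    """Divide each target band intensity by the control intensity of the same lane."""
    if len(target) != len(control):
        raise ValueError("target and control must have one value per lane")
    if any(c <= 0 for c in control):
        raise ValueError("control intensities must be positive")
    return [t / c for t, c in zip(target, control)]


# Hypothetical band intensities (arbitrary units), one value per lane.
desaturase_signal = [100.0, 300.0, 240.0]
ef1a_signal = [50.0, 100.0, 120.0]
print(normalize_to_control(desaturase_signal, ef1a_signal))
```

Only the normalized ratios, not the raw intensities, are comparable between lanes.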

As depicted in Figs. 15A and 15B, Δ6-
desaturase is expressed primarily in borage seed.
Borage seeds reach maturation between 18-20 days post
pollination (dpp). Δ6-desaturase mRNA expression
occurs throughout the time points collected (8-20
dpp), but appears maximal from 10-16 days post
pollination. This expression profile is similar to
that seen for borage oleosin and 12S seed storage
protein mRNAs (Figs. 16A, 16B, and 16C).

STRATAGENE) or using oligotex-dT latex particles
(QIAGEN).

Construction of tissue-specific cDNA libraries

Flower, one day silique, three day silique,
leaf, root, and developing seed cDNA libraries were
each constructed from 5 µg poly A+ RNA using the ZAP
cDNA synthesis kit (Stratagene). cDNAs were
directionally cloned into the EcoRI and XhoI sites of
pBluescript SK(-) in the λ-ZAPII vector (Short et al.
1988 Nucleic Acids Res. 16:7583-7600). Nonrecombinant
phage plaques were identified by blue color
development on NZY plates containing X-gal (5-bromo-4-
chloro-3-indolyl-β-D-galactopyranoside) and IPTG
(isopropyl-1-thio-β-D-galactopyranoside). The
nonrecombinant backgrounds for the flower, one day
silique, three day silique, leaf, root, and developing
seed cDNA libraries were 2.8%, 2%, 3.3%, 6.5%, 2.5%,
and 1.9% respectively.

Random priming DNA labeling

The cDNA inserts of isolated clones
(unhybridized cDNAs) were excised by EcoRI/XhoI double
digestion and gel-purified for random priming
labeling. Klenow reaction mixture contained 50 ng DNA
templates, 10 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 7.5 mM
DTT, 50 µM each of dCTP, dGTP, and dTTP, 10 µM hexamer
random primers (Boehringer Mannheim), 50 µCi α-32P-
dATP, 3000 Ci/mmole, 10 mCi/ml (DuPont), and 5 units
of DNA polymerase I Klenow fragment (New England
Biolabs). The reactions were carried out at 37°C for
one hour. Aliquots of diluted reaction mixtures were
used for TCA precipitation and alkaline denaturing gel
analysis. Hybridization probes were labeled only with
Klenow DNA polymerase and the unincorporated dNTPs
were removed using Sephadex G-50 spin columns
(Boehringer Mannheim).

Random Primed PCR

Double-stranded cDNA was synthesized from
poly A+ RNA isolated from Arabidopsis root tissue
using the cDNA Synthesis System (GIBCO BRL) with oligo
dT12-18 as primers. cDNAs longer than 300 bp were
enriched by Sephacryl S-400 column chromatography
(Stratagene). Fractionated cDNAs were used as
templates for RP-PCR labeling. The reaction contained
10 mM Tris-HCl, pH 9.0, 50 mM KCl, 0.1% Triton X-100,
2 mM MgCl2, 5 units Taq DNA polymerase (PROMEGA), 200
µM dCTP, dGTP, and dTTP, and different concentrations
of hexamer random primers, α-32P dATP, 800 mCi/mmole,
10 mCi/ml (DuPont), and cold dATP in a final volume of
25 µl. After an initial 5 minutes at 95°C, different
reactions were run through different programs to
optimize RP-PCR cDNA conditions. Unless otherwise
indicated, the following program was used for most RP-
PCR cDNA probe labeling: 95°C/5 minutes, then 40
cycles of 95°C/30 seconds, 18°C/1 second, ramp to 30°C
at a rate of 0.1°C/second, 72°C/1 minute. RP-PCR
products were phenol/chloroform extracted and ethanol
precipitated or purified by passing through Sephadex
G-50 spin columns (Boehringer Mannheim).
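
By way of illustration only, the duration of the RP-PCR program stated above can be estimated from its hold times and the slow ramp from 18°C to 30°C. The sketch assumes that every other temperature transition is instantaneous, which is an assumption and underestimates the real run time. All names are hypothetical.

```python
# Program parameters as stated in the text.
INITIAL_DENATURATION_S = 5 * 60  # 95 C for 5 minutes
CYCLES = 40
DENATURE_S = 30                  # 95 C for 30 seconds
ANNEAL_HOLD_S = 1                # 18 C for 1 second
RAMP_START_C = 18.0
RAMP_END_C = 30.0
RAMP_RATE_C_PER_S = 0.1          # slow ramp, 0.1 C per second
EXTEND_S = 60                    # 72 C for 1 minute


def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time needed to move from start_c to end_c at a constant ramp rate."""
    return (end_c - start_c) / rate_c_per_s


def program_seconds() -> float:
    """Total hold plus slow-ramp time; other transitions assumed instantaneous."""
    ramp = ramp_seconds(RAMP_START_C, RAMP_END_C, RAMP_RATE_C_PER_S)
    per_cycle = DENATURE_S + ANNEAL_HOLD_S + ramp + EXTEND_S
    return INITIAL_DENATURATION_S + CYCLES * per_cycle


print(f"slow ramp per cycle: {ramp_seconds(18.0, 30.0, 0.1):.0f} s")
print(f"estimated program time: {program_seconds() / 3600:.2f} h")
```

Under these assumptions the slow ramp accounts for more than half of each cycle.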

Clone blot virtual subtraction

Mass excision of λ-ZAP cDNA libraries was
carried out by co-infecting XL1-Blue MRF' host cells
with recombinant phage from the libraries and ExAssist
helper phage (STRATAGENE). Excised phagemids were
rescued by SOLR cells. Plasmid DNAs were prepared by
boiling mini-prep method (Holmes et al. 1981 Anal.
Biochem. 114:193-197) from randomly isolated clones.
cDNA inserts were excised by EcoRI and XhoI double
digestion, and resolved on 1% agarose gels. The DNAs
were denatured in 0.5 M NaOH and 1.5 M NaCl for 45
minutes, neutralized in 0.5 M Tris-HCl, pH 8.0, and
1.5 M NaCl for 45 minutes, and then transferred by
blotting to nylon membranes (Micron Separations, Inc.)
in 10X SSC overnight. After one hour prehybridization
at 65°C, root RP-cDNA probe was added to the same
hybridization buffer containing 1% bovine albumin
fraction V (Sigma), 1 mM EDTA, 0.5 M NaHPO4, pH 7.2,
7% SDS. The hybridization continued for 24 hours at
65°C. The filters were washed in 0.5% bovine albumin,
1 mM EDTA, 40 mM NaHPO4, pH 7.2, 5% SDS for ten
minutes at room temperature, and 3 x 10 minutes in 1
mM EDTA, 40 mM NaHPO4, pH 7.2, 1% SDS at 65°C.
Autoradiographs were exposed to X-ray films (Kodak)
for two to five days at -80°C.

Hybridization of resulting blots with root
RP-PCR probes "virtually subtracted" seed cDNAs shared
with the root mRNA population. The remaining seed
cDNAs representing putative seed-specific cDNAs,
including those encoding oleosins, were sequenced by
the cycle sequencing method, thereby identifying AtS21
as an oleosin cDNA clone.

Sequence analysis of AtS21

The oleosin cDNA is 834 bp long including an
18 bp long poly A tail (Fig. 4, SEQ ID NO:2). It has
high homology to other oleosin genes from Arabidopsis
as well as from other species. Recently, an identical
oleosin gene has been reported (Zou et al., 1996,
Plant Mol. Biol. 31:429-433). The predicted protein is
191 amino acids long with a highly hydrophobic middle
domain flanked by a hydrophilic domain on each side.
The existence of two upstream in frame stop codons and
the similarity to other oleosin genes indicate that
this cDNA is full-length. Since there are two in frame
stop codons just upstream of the first ATG, this cDNA
is considered to be a full length cDNA (Figure 4, SEQ
ID NO:2). The predicted protein has three distinctive
domains based on the distribution of its amino acid
residues. Both the N-terminal and C-terminal domains
are rich in charged residues while the central domain
is absolutely hydrophobic (Figure 5). As many as 20
leucine residues are located in the central domain and
arranged as repeats with one leucine occurring every
7-10 residues. Other non-polar amino acid residues
are also clustered in the central domain making this
domain absolutely hydrophobic (Figure 6).
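
By way of illustration only, the full-length criterion used above (stop codons upstream of the first ATG and in the same reading frame) can be checked as sketched below. The function name and the short test sequence are hypothetical; the sequence is not SEQ ID NO:2.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_inframe_stops(cdna: str, atg_index: int) -> list:
    """Return 0-based positions of stop codons that lie upstream of the ATG
    at atg_index and in the same reading frame as that ATG."""
    cdna = cdna.upper()
    if cdna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    positions = []
    # Walk backwards from the ATG in steps of three nucleotides.
    for pos in range(atg_index - 3, -1, -3):
        if cdna[pos:pos + 3] in STOP_CODONS:
            positions.append(pos)
    return sorted(positions)


# Made-up 5' end: TAG CCC TGA AAA ATG GCT
example = "TAGCCCTGAAAAATGGCT"
print(upstream_inframe_stops(example, example.find("ATG")))
```

An in-frame stop codon upstream of the ATG rules out a longer open reading frame extending past the 5' end of the clone.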
Extensive searches of different databases
using both AtS21 cDNA and its predicted protein
sequence identified oleosins from carrot, maize,
cotton, rapeseed, Arabidopsis, and other plant
species. The homology is mainly restricted to the
central hydrophobic domain. Seven Arabidopsis oleosin
sequences were found. AtS21 represents the same gene
as Z54164 which has a few more bases in the 5'
untranslated region. The seven Arabidopsis oleosin
sequences available so far were aligned to each other
(Figure 7). The result suggested that the seven
sequences fall into three groups. The first group
includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID NO:6),
and the partial sequence Z29859 (SEQ ID NO:7). Since
X91918 (SEQ ID NO:6) has only its last residue
different from AtS21 (SEQ ID NO:5), and since Z29859
(SEQ ID NO:7) has only three amino acid residues which
are different from AtS21 (SEQ ID NO:5), all three
sequences likely represent the same gene. The two
sequences of the second group, X52352 (SEQ ID NO:8)
and Atol3 (SEQ ID NO:9), are different in both
sequence and length. Thus, there is no doubt that
they represent two independent genes. Like the first
group, the two sequences of the third group, X91956
(SEQ ID NO:10) and L40954 (SEQ ID NO:11), also have
only three divergent residues which may be due to
sequence errors. Thus, X91956 (SEQ ID NO:10) and
L40954 (SEQ ID NO:11) likely represent the same gene.
Unlike all the other oleosin sequences which were
predicted from cDNA sequences, X62352 (SEQ ID NO:8)

EXAMPLE 6

Characterization of Oleosin
Genomic Clones and Isolation of Oleosin Promoter

Genomic clones were isolated by screening an
Arabidopsis genomic DNA library using the full length
cDNA (AtS21) as a probe. Two genomic clones were
mapped by restriction enzyme digestion followed by
Southern hybridization using the 5' half of the cDNA
cleaved by SacI as a probe. A 2 kb SacI fragment was
subcloned and sequenced (Fig. 9, SEQ ID NO:35). Two
regions of the genomic clone are identical to the cDNA
sequence. A 395 bp intron separates the two regions.

The copy number of the AtS21 gene in the
Arabidopsis genome was determined by genomic DNA
Southern hybridization following digestion with the
enzymes BamHI, EcoRI, HindIII, SacI and XbaI, using
the full length cDNA as a probe (Figure 8B). A single
band was detected in all the lanes except the SacI
digestion, where two bands were detected. Since the
cDNA probe has an internal SacI site, these results
indicated that AtS21 is a single copy gene in the
Arabidopsis genome. Since it is known that the
Arabidopsis genome contains different isoforms of
oleosin genes, this Southern analysis also
demonstrates that the different oleosin isoforms of
Arabidopsis are divergent at the DNA sequence level.
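
By way of illustration only, the reasoning behind the single copy conclusion can be written out: for complete digestion of a single copy gene, an enzyme that cuts k times within the region detected by the probe yields k + 1 hybridizing bands. In the sketch below only the SacI value is stated in the text; the zero values for the other enzymes are an assumption consistent with the single bands reported. All names are hypothetical.

```python
# Cut sites of each enzyme inside the region detected by the cDNA probe.
# SacI = 1 is stated in the text; the zeros are assumed, not stated.
INTERNAL_SITES = {"BamHI": 0, "EcoRI": 0, "HindIII": 0, "SacI": 1, "XbaI": 0}


def expected_bands(internal_sites: int, gene_copies: int = 1) -> int:
    """Hybridizing fragments expected after complete digestion, assuming each
    gene copy sits in a different genomic context."""
    if internal_sites < 0 or gene_copies < 1:
        raise ValueError("internal_sites must be >= 0 and gene_copies >= 1")
    return gene_copies * (internal_sites + 1)


for enzyme, sites in INTERNAL_SITES.items():
    print(enzyme, expected_bands(sites))
```

A second gene copy would be expected to add bands in the lanes of enzymes that do not cut within the probe region, which is not what was observed.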

Two regions, separated by a 395 bp intron,
of the genomic DNA fragment are identical to the AtS21
cDNA sequence. Database searches using the 5'
promoter sequence upstream of the AtS21 cDNA sequence did
not identify any sequence with significant homology.
Furthermore, the comparison of the AtS21 promoter sequence
with another Arabidopsis oleosin promoter isolated
previously (Van Rooijen et al., 1992) revealed
little similarity. The AtS21 promoter sequence is
rich in A/T bases, and contains as many as 44 direct
repeats ranging from 10 bp to 14 bp with only one
mismatch allowed. Two 14 bp direct repeats, and a
putative ABA response element are underlined in Figure
9.
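
By way of illustration only, the two promoter features mentioned above (A/T richness and direct repeats) can be computed as sketched below. The sketch finds exact repeats only, whereas the analysis in the text allowed one mismatch, so it is a simplification. The function names and the short test sequence are hypothetical and are not taken from Figure 9.

```python
from collections import defaultdict


def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def exact_direct_repeats(seq: str, k: int) -> dict:
    """Map each k-mer that occurs more than once to its 0-based start positions.
    Exact matches only; mismatches are not tolerated in this sketch."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Made-up A/T-rich sequence containing one 10 bp direct repeat.
example = "AATTTAAATTGCAATTTAAATT"
print(round(at_fraction(example), 3))
print(exact_direct_repeats(example, 10))
```

Running the repeat search for each length from 10 to 14 would cover the size range described in the text.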

EXAMPLE 7

Construction of AtS21
Promoter/GUS Gene Expression Cassette and Expression
Patterns in Transgenic Arabidopsis and Tobacco

Construction of AtS21 promoter/GUS gene expression
cassette

The 1267 bp promoter fragment starting from
the first G upstream of the ATG codon of the genomic
DNA fragment was amplified using PCR and fused to the
GUS reporter gene for analysis of its activity.
The promoter fragment of the AtS21 genomic clone was
amplified by PCR using the T7 primer
GTAATACGACTCACTATAGGGC (SEQ ID NO:13) and the 21P
primer GGGGATGGTATAGTAAAACTATAGAGTAAAGG (SEQ ID NO:14)
complementary to the 5' untranslated region upstream
of the first ATG codon (Figure 9). A BamHI cloning
site was introduced by the 21P primer. The amplified
fragment was cloned into the BamHI and SacI sites of
pBluescript KS (Stratagene). Individual clones were
sequenced to check possible PCR mutations as well as
the orientation of their inserts. The correct clone
was digested with BamHI and HindIII, and the excised
promoter fragment (1.3 kb) was cloned into the
corresponding sites of pBI101.1 (Jefferson, R.A.
1987a, Plant Mol. Biol. Rep. 5:387-405; Jefferson et
al., 1987b, EMBO J. 6:3901-3907) upstream of the GUS
gene. The resultant plasmid was designated pAN5 (Fig.
10). The AtS21 promoter/GUS construct (pAN5) was
introduced into both tobacco (by the leaf disc method,
Horsch et al., 1985; Bogue et al. 1990 Mol. Gen. Gen.

lauryl sarcosine. The tissue debris was removed by 5
minutes centrifugation in a microfuge. The
supernatant was aliquoted and mixed with substrate and
incubated at 21°C for 1 hour. Three replicas were
assayed for each sample. The reactions were stopped
by adding 4 volumes of 0.2 M sodium carbonate.
Fluorescence was read using a TKO-100 DNA fluorometer
(Hoefer Scientific Instruments). Protein
concentrations of the extracts were determined by the
Bradford method (Bio-Rad).
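
By way of illustration only, a fluorometric reading and a Bradford protein determination are commonly combined into a specific activity, expressed as amount of product per minute per mg protein, and averaged over the three replicas. The sketch assumes that fluorescence has already been converted to pmol of product with a standard curve, a step not described in this passage. All names and numerical values are hypothetical.

```python
def specific_activity(product_pmol: float, minutes: float, protein_mg: float) -> float:
    """Specific activity in pmol product per minute per mg protein."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    return product_pmol / minutes / protein_mg


def mean(values: list) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical readings for three replicas of one extract: 60 minute
# incubation (as in the text) and an assumed 0.25 mg protein per reaction.
replicas_pmol = [1180.0, 1200.0, 1220.0]
activities = [specific_activity(p, 60.0, 0.25) for p in replicas_pmol]
print(mean(activities))
```

Normalizing to protein content allows activities from extracts of different tissues and stages to be compared.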

Expression patterns of AtS21 promoter/GUS in
transgenic Arabidopsis and tobacco

In Arabidopsis, GUS activity was detected in
green seeds, and node regions where siliques, cauline
leaves and branches join the inflorescence stem
(Figures 11A and 11B). No GUS activity was detected
in any leaf, root, flower, silique coat, or the
internode regions of the inflorescence stem. Detailed
studies of the GUS expression in developing seeds
revealed that the AtS21 promoter was only active in
green seeds in which the embryos had already developed
beyond heart stage (Figures 11C and 11G). The
youngest embryos showing GUS activity that could be
detected by histochemical staining were at early
torpedo stage. Interestingly, the staining was
restricted to the lower part of the embryo including
hypocotyl and embryonic radicle. No staining was
detected in the young cotyledons (Figures 11D and
11E). Cotyledons began to be stained when the embryos
were at late torpedo or even early cotyledon stage
(Figures 11F and 11H). Later, the entire embryos were
stained, and the staining became more intense as the
embryos matured (Figures 11I and 11J). It was also
observed that GUS gene expression was restricted to
the embryos. Seed coat and young endosperm were not
stained (Figure 11C).

GUS activity was also detected in developing
seedlings. Young seedlings of 3-5 days old were
stained everywhere. Although some root hairs close to
the hypocotyl were stained (Figure 11K), most of the
newly formed structures such as root hairs, lateral
root primordia and shoot apex were not stained
(Figures 11L and 11N). Later, the staining was
restricted to cotyledons and hypocotyls when lateral
roots grew from the elongating embryonic root. The
staining on embryonic roots disappeared. No staining
was observed on newly formed lateral roots, true
leaves or trichomes on true leaves (Figures 11M and
11N).

AtS21 promoter/GUS expression patterns in
tobacco are basically the same as in Arabidopsis. GUS
activity was only detected in late stage seeds and
different node regions of mature plants. In
germinating seeds, strong staining was detected
throughout the entire embryos as soon as one hour
after they were dissected from imbibed seeds. Mature
endosperm, which Arabidopsis seeds do not have, was
also stained, but the seed coat was not (Figure 12A). The root
tips of some young seedlings of one transgenic line
were not stained (Figure 12B). Otherwise, GUS
expression patterns in developing tobacco seedlings
were the same as in Arabidopsis seedlings (Figures
12B, 12C, and 12D). Newly formed structures such as
lateral roots and true leaves were not stained.

AtS21 mRNA levels in developing seedlings

Since the observed strong activities of
AtS21 promoter/GUS in both Arabidopsis and tobacco
seedlings are not consistent with the seed-specific
expression of oleosin genes, Northern analysis was
carried out to determine if AtS21 mRNA was present in
developing seedlings where the GUS activity was so
strong. RNAs prepared from seedlings at different
stages from 24 hours to 12 days were analyzed by
Northern hybridization using AtS21 cDNA as the probe.
Surprisingly, AtS21 mRNA was detected at a high level
comparable to that in developing seeds in 24-48 hour
imbibed seeds. The mRNA level dropped dramatically
when young seedlings first emerged at 74 hours
(Figures 13A and 13B). In 96 hour and older
seedlings, no signal was detected even with a longer
exposure (Figure 13B). The loadings of RNA samples
were checked by hybridizing the same blot with a
tubulin gene probe (Figure 13C) which was isolated and
identified by EST analysis as described in Example 2.
Since AtS21 mRNA was so abundant in seeds, residual
AtS21 probes remained on the blot even after extensive
stripping. These results indicated that the AtS21 mRNA
detected in imbibed seeds and very young seedlings is
the carry-over of AtS21 mRNA from dry seeds. It has
recently been reported that an oleosin Atol2 mRNA
(identical to AtS21) is most abundant in dry seeds
(Kirik et al., 1996 Plant Mol. Biol. 31(2):413-417).
Similarly, the strong GUS activities in seedlings were
most likely due both to the carry-over of β-
glucuronidase protein and to the de novo synthesis of β-
glucuronidase from its mRNA carried over from the dry
seed stage.

EXAMPLE 8

Activity comparison between the
AtS21 promoter and the 35S promoter

The GUS activities in transgenic Arabidopsis 
developing seeds expressed by the AtS21 promoter were 
compared with those expressed by the 35S promoter in 
the construct pBI221 (Jefferson et al. EMBO J. 
6:3901-3907). The seeds were staged according to their 
colors (Table 2). The earliest stage was from 
globular to late heart stage, when the seeds were still 
white but large enough to be dissected from the 
siliques. AtS21 promoter activity was detected at a 
level about three times lower than that of the 35S 
promoter at this stage. 35S promoter activity 
remained at the same low level throughout the entire 
embryo development. In contrast, AtS21 promoter 
activity increased quickly as the embryos passed 
torpedo stage and reached the highest level of 25.25 
pmole 4-MU/min·µg protein at mature stage 
(Figure 5-8). The peak activity of the AtS21 promoter is as 
much as 210 times higher than its lowest activity at 
globular to heart stage, and is close to 100 times 
higher than the 35S promoter activity at the same 
stage (Table 2). The activity levels of the AtS21 
promoter are similar to those of another Arabidopsis 
oleosin promoter expressed in Brassica napus (Plant et 
al. 1994, Plant Mol. Biol. 25:193-205). AtS21 promoter 
activity was also detected at background level in 
leaf. The high standard deviation, higher than the 
average itself, indicated that the GUS activity was 
[illegible] (Table 2). 
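
The fold differences quoted in this example can be turned into rough absolute values. The sketch below is only illustrative arithmetic on the figures stated in the text (a peak of 25.25 pmole 4-MU/min·µg protein, a rise of as much as 210-fold, a roughly 100-fold excess over 35S at maturity, and a roughly threefold deficit at the earliest stage). The variable and function names are hypothetical, and the derived numbers are implied estimates, not values read from Table 2.

```python
# Back-of-the-envelope check of the fold changes quoted in Example 8.
# Inputs are the figures stated in the text; everything derived from
# them is an implied estimate, not a measured value from Table 2.

PEAK_ATS21 = 25.25          # pmole 4-MU/min per ug protein, mature stage
FOLD_OVER_LOWEST = 210      # peak vs. globular-to-heart stage ("as much as")
FOLD_OVER_35S_MATURE = 100  # AtS21 vs. 35S at mature stage ("close to")
FOLD_BELOW_35S_EARLY = 3    # 35S vs. AtS21 at earliest stage ("about")


def implied_activities():
    """Return the activities implied by the quoted fold differences."""
    ats21_early = PEAK_ATS21 / FOLD_OVER_LOWEST
    s35_mature = PEAK_ATS21 / FOLD_OVER_35S_MATURE
    s35_early = ats21_early * FOLD_BELOW_35S_EARLY
    return {
        "AtS21 early": ats21_early,
        "35S early": s35_early,
        "35S mature": s35_mature,
    }


if __name__ == "__main__":
    for name, value in implied_activities().items():
        print(f"{name}: ~{value:.2f} pmole 4-MU/min per ug protein")
```

Under these assumptions the implied 35S activity is a few tenths of a pmole at both stages, which agrees with the statement that 35S activity stayed low throughout embryo development.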
T -^^q promoter ;.rtivicy was 

promoter and 3bS P promoter actx 

Although the A tobacco th.n 

. 3 ,,„es lower in dry seed ^^^^^^^^^^ 
^^""^ ..dry seed, the aosolute „oter 

■^^■".rrh "han that exP^e-ed oV t-^^^^^_^^ 
"^lU tpsxs leaf (Table 2) ^ ^ ,,,, .p..ure 

A^-^ ,as observed x. 

promoter ac„ 

• -.on of the AtS21 promote • 

Comparxson ox T;=tter is '^o- ^ 

r.-.led that the la^^e ^gvelopxng 
.0 express genes ■ ^ activitxes 
promoter .0 e. P ^^nsister.c lo- - 353 

Because of ^ts .g^^elopment pe^-O 

entxre embryo de expressxon 
throughout consistent lo.> - 

is usefux xu- vand the Atb-- 

promotex -s ^^^^3,- nano. ^^^^^ 

-^-^^ the dry seed --^^ 

accumulating unt ^^..cxent, xs be^.e- 

.^r although not er embryos prxo 

promoter, a ^^p^essxng genes 

AtS21 promoter 

EXAMPLE 9 



B^,... ^^^^^ 

the Borage " ^e^aW^^^^^^^ comparison to 
Expression of the^ ^tS2l Promoter and^^ promoter 

S^resfion^nder the Control of 

- ,o create an expression construct 
in oraer to crea ^^pression of tne 

,,,tn the .tS21 ^--^^;J;";;rous coding fragment 
borage A5-desaturase gen 3^,1 and 

f.o. PAH5 was re^ovea by d.. ,,,,3 then 

HcoXCH I. The cm^ .nse- o P ^^^^ ^^^.^.^^^^ 

,,,,3ed by first ^^^^^^J^^^^^^,^^ , ,nd then digesting 

.^nal overhang as abo^e), replace 
the residual o fraament was asec 

,,,.^S.aI. -^--^^^:7,,3 yielding PA«3. 

::r:tn.tore.ass.^^^ 

of Z.^'-desc.turas- ^^^^^^^ ^g^ers o_ 

.he corresponding r atty -^^^ ^^^^^^ 

reaction products, V " - ^^.^^ ^.e^hods 

occadecatetraenoxc ac-.. - ^^^^ .^vels 

referred to xn ^xamP- 3- _ ^^^^^^^ e.r 



cransgenic seeds 1,.,.^ 



(Table j; — _ ^ ^^d 2 . B"^ u-.— 

CIS £attv acids (He.n - 3^^ ^^^^ ^^^^^^^^ 

leaves of .hese P^-"" pXa.ts produce. GL. 

p.o™ouer/,V-d.sa.urase .rans, ^^^^ 

"r^ra.n:— a.l.OT..nse.ds. 

(Mean - i- • ^ 

EXAMPLE 10 

Transformation of Oilseed Rape With an Expression 
Cassette Which Comprises the Oleosin 5' Regulatory 
Region Linked to the Borage Delta 6-Desaturase Gene 

Oilseed rape, cv. Westar, was transformed 
with the Agrobacterium tumefaciens strain EHA105 
containing the plasmid pAN3 (i.e. the borage 
Δ6-desaturase gene under the control of the Arabidopsis 
oleosin promoter; see Example 9). 

Terminal internodes of Westar were 
co-cultivated for 2-3 days with induced Agrobacterium 
tumefaciens strain EHA105 (Alt-Moerbe et al. 1988 Mol. 
Gen. Genet. 213:1-8; James et al. 1993 Plant Cell 
Reports 12:559-563), then transferred onto 
regeneration medium (Boulter et al. 1990 Plant Science 
70:91-99; Fry et al. 1987 Plant Cell Reports 
6:321-325). The regenerated shoots were transferred to 
growth medium (Pelletier et al. 1983 Mol. Gen. Genet. 
191:244-250), and a polymerase chain reaction (PCR) 
test was performed on leaf fragments to assess the 
presence of the gene. 

DNA was isolated from the leaves according 
to the protocol of K.M. Haymes et al. (1996) Plant 
Molecular Biology Reporter 14(3):280-284, and 
resuspended in 100 µl of water, without RNase 
treatment. 5 µl of extract were used for the PCR 
reaction, in a final volume of 50 µl. The reaction was 
performed in a Perkin-Elmer 9600 thermocycler, with 
the following cycles: 

1 cycle: 95°C, 5 minutes 

30 cycles: 95°C, 45 sec; 51°C, 45 sec; 
1 minute 

1 cycle: 72°C, 5 minutes 

and the following primers (derived from near the metal 
box regions, as indicated in Fig. 1, SEQ ID NO:1): 
5' TGG AAA TGG AAC CAT AA 3' 
5' GGA AAC AAA TGA TGG TC 3' 
Amplification of the DNA revealed the expected 549 
base pair PCR fragment (Figure 17). 
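
As a quick sanity check on the PCR conditions above, the programmed hold times and some basic properties of the two primers can be computed. This is a minimal sketch under stated assumptions: the function names are hypothetical, ramp times between temperatures are ignored, and the melting temperature uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough approximation for short oligonucleotides that is not taken from the source.

```python
# Sketch: total hold time of the PCR programme and simple primer statistics.
# Step durations and primer sequences are copied from the text; the Wallace
# rule for Tm is an outside approximation, and ramp times are ignored.

INITIAL_DENATURATION_S = 5 * 60
CYCLES = 30
# Per-cycle holds in seconds: denaturation, annealing, and the 1-minute
# third step (its temperature is not given in the text).
CYCLE_STEPS_S = (45, 45, 60)
FINAL_EXTENSION_S = 5 * 60

PRIMERS = {
    "forward": "TGGAAATGGAACCATAA",
    "reverse": "GGAAACAAATGATGGTC",
}


def total_hold_minutes():
    """Sum of programmed hold times, in minutes (no ramping)."""
    total_s = (
        INITIAL_DENATURATION_S
        + CYCLES * sum(CYCLE_STEPS_S)
        + FINAL_EXTENSION_S
    )
    return total_s / 60


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_minutes():.0f} min")
    for name, seq in PRIMERS.items():
        print(
            f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
            f"Tm ~{wallace_tm(seq)} C"
        )
```

Because ramping is excluded, the real run time on the thermocycler would be somewhat longer than the computed hold time.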

The positive shoots were transferred to 
elongation medium, then to rooting medium (DeBlock et 
al. 1989 Plant Physiol. 91:694-701). Shoots with a 
well-developed root system were transferred to the 
greenhouse. When plants were well developed, leaves 
were collected for Southern analysis in order to 
assess gene copy number. 

Genomic DNA was extracted according to the 
procedure of Bouchez et al. (1995) Plant Molecular 
Biology Reporter 14:115-123, digested with the 
restriction enzymes Bgl I and/or Cla I, 
electrophoretically separated on agarose gel (Maniatis 
et al. 1982, in Molecular Cloning: A Laboratory 
Manual, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY), and prepared for transfer to nylon 
membranes (Nytran membrane, Schleicher & Schuell) 
according to the instructions of the manufacturer. 
DNA was then transferred to membranes overnight by 
capillary action using 20XSSC (Maniatis et al. 1982). 
, ,,e-p crosslinked by 

UV '"""■'^""V" s'ml o£ a solution conca.n.n. 
1 Uour at 65>X .n 1. ml ,^,,dtatea sKir« n,..-. 

eKSSC, O.^SBS anc, 2^2. / ..^p.i^ene, . Tne 

olass vials in hybrifli-tion 

" =. were hVoridized overn.gh- ^^^e 

membranes were .^natured hybridization P 

solution ° " specific activity ol 10 

r.diolabelled with P " the ReadyTo- 

cpn>/« W the random J^! ,„be represents 

CO Kit Obtained .ro.;na. -;-^^ ...esaturase ,ene 
, PCR Eragment oE ^'^^ „„,-, .ne prin-ers 

(Obtained in the ^ ,,,ion , the filters 

„«e washed at 65 ^ - J^^;^ „,,_,,,3s. The membranes 
and 0.2XSSC, 0 . 1«DS ..posed to KooaK 

were then wrapped - - screen at .^O'C in a 

film usm. an -t- ^-^^^^^ ^^^^ ^,„3,a.., 
light-prooi cassette. 

. ^ =.-n fhe presence ot 

T.e results obtained ---- ,,.,er 
according to the gene cons 
the gene. Acco digestea dv Sgi 

, each lane oi-- ul'^ ^ _ „,,,_,3se genes 

o£ bands m eaci f^igica b-aesaturase j 

the number of de-v.^ 
I represencs tne ^ The 

— ' ^" rrU ci: l t^ether generates a 

digestion with Bgl 
- fragment of 3435 bp- „oo.prising" is 
defined as spe components as 

nne or more otne- 
addition of one ,j,,,eof. 

„ croups Tints'- = 
components, oi 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: Rhone Poulenc Agro 

Thomas, Terry L. 
Li, Zhongsen 



(ii) TITLE OF INVENTION: AN OLEOSIN 5' REGULATORY REGION FOR THE 

MODIFICATION OF PLANT SEED LIPID COMPOSITION 



(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Scully, Scott, Murphy & Presser 

(B) STREET: 400 Garden City Plaza 

(C) CITY: Garden City 

(D) STATE: New York 

(E) COUNTRY: USA 

(F) ZIP: 11530 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/831,575 

(B) FILING DATE: 9 April 1997 

(C) CLASSIFICATION: 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: DiGiglio, Frank S. 

(B) REGISTRATION NUMBER: 31,346 

(C) REFERENCE/DOCKET NUMBER: 10203 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (516) 742-4343 

(B) TELEFAX: (516) 742-4366 



(2) INFORMATION FOR SEQ ID NO: 1: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 43..1387 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

r .rCCTCCC/VA AGAGAGTAGT CATTTTTCAT 
ATATCTGCCT ACCC TCCU/v. 



ATC 
He 
5 

GGA 

Gly 



;^^G AAA TAG ATT 
Lys bys Tyr He 



GAT 
ASP 



TGG 
Trp 



GTG 
Val 



GGT 

Gly 



CAA 
Gin 



CTA TGG 
Leu Trp 

AAA GAG 
Lys Asp 
40 

GAG GTA 
Glu val 
55 



ATC 
He 
25 

CAT 
His 



ACC 
Thr 
10 

TCG 
Ser 



TCA GAT GAA CTC 
Ser ASP Glu beu 



CCA 
pro 



ACT 
Thr 



GAT 
ASP 



ATT CAA GGG 
lie Gin Gly 

GGT GGC AGC 

Gly Gly 

45 

GCA TTT GTT 
Ala Phe Val 
60 



AAA 
Lys 
30 

TTT 
Phe 



AAG 
Lys 
15 

GCC 
Ala 



CA ATG GGT GCT CAA 
' Met Ala Ala Gin 
1 

AAC CAC GAT AAA CCC 
Asn His ASP bys pro 

TAT GAT GTT TCG GAT 
ASP val ser Asp 



GCA 
Ala 



Trp 



Ser 
85 

TCT 
ser 



AAG 
Lys 
7 0 

GTT 
val 



AAT CTT 
Asn Leu 



GAT 
Asp 



AAG 
Lys 



TCT GAG GTT 
ser Glu Val 



AAA ATG 
Lys Met 



GGT TTG 
Gly Leu 
105 



TCT 
Ser 
90 

TAT 
Tyr 



TTT TTC ACT GGG 
Phe Phe Thr Gly 
75 

AAA GAT TAT 
Lys Asp Tyr 



AGG 
Arg 



CCC TTG AAG AGT CTT GCT 
pro Leu Lys Ser Leu Ala 

5 0 

rrTC CAT CCT GCC TCT ACA 
;he His pro Ala Ser Thr 
65 

TAT TAT CTT AAA GAT TAG 
lyr Tvr Leu Lys Asp Tyr 
80 

CTT GTG TTT GAG TTT 
ieu val Phe Glu Phe 



LVS 
9 5 



TTG 
Leu 



TGC TTT 
Cys Phe 



"ttt 

Phe 



TGT GAG 
cys Glu 
135 



ATA GCA 
lie Ala 
120 

GGT GTT 
Gly val 



ATG 
Met 



GAC AAA 
Asp Lys 

CTG TTT 
Leu Phe 



TTG 
Leu 



GTA CAT 
val His 
140 



AAA GGT 
Lys Gly 
110 

GCT ATG 
Ala Met 
125 

TTG TTT 
Leu Phe 



CA.T 
His 



AGT 
Ser 



TCT 
Ser 



TTT 
Phe 



CTT TGG 
Leu Trp 
150 



ATT GAG 
lie Gin 



AGT 
Ser 



ATG 
Met 
165 

GCA 
Ala 



GTA GTG 
val val 



TCT GAT 
Ser Asp 



;VAT TGT CTT TCA 
Asn cys Leu Ser 



TCA 

Ser 
170 

GGA 

Gly 



GGT TGG 
Gly Trp 
155 

AGG CTT 
Arg Leu 



ATT GGA 

lie Gly 

AAT AAG 
Asn Lys 



pjTA AGT ATT GGT 
lie ser He Gly 



CAT 
His 



TTT 
Phe 
115 

TGG 
Trp 



ATT ATG TTT GCA ACT 
lie >5et Phe Ala Thr 
115 

rTT TAT GGG GTT TTG 
val Tyr Gly Val Leu 
130 

GGG TGT TTG ATG GGG 
Gly cys Leu Met Gly 
145 

GAT GCT GGG CAT TAT 
Sp Ala Gly Hxs Tyr 
160 

ATG GGT ATT TTT GCT 
Set Gly lie Phe Ala 

TGG AA,^ TGG AAC CAT 
Trp Lys Trp Asn His 



54 



102 



150 



190 



246 



294 



342 



390 



438 



486 



534 



582 



630 

AAT GCA CAT CAC ATT GCC TGT AAT AGC CTT GAA TAT GAC CCT GAT TTA 
Asn Ala His His Ile Ala Cys Asn Ser Leu Glu Tyr Asp Pro Asp Leu 
200 205 210 

CAA TAT ATA CCA TTC CTT GTT GTG TCT TCC AAG TTT TTT GGT TCA CTC 
Gln Tyr Ile Pro Phe Leu Val Val Ser Ser Lys Phe Phe Gly Ser Leu 
215 220 225 

ACC TCT CAT TTC TAT GAG AAA AGG TTG ACT TTT GAC TCT TTA TCA AGA 
Thr Ser His Phe Tyr Glu Lys Arg Leu Thr Phe Asp Ser Leu Ser Arg 
230 235 240 

TTC TTT GTA AGT TAT CAA CAT TGG ACA TTT TAC CCT ATT ATG TGT GCT 
Phe Phe Val Ser Tyr Gln His Trp Thr Phe Tyr Pro Ile Met Cys Ala 
245 250 255 260 

GCT AGG CTC AAT ATG TAT GTA CAA TCT CTC ATA ATG TTG TTG ACC AAG 
Ala Arg Leu Asn Met Tyr Val Gln Ser Leu Ile Met Leu Leu Thr Lys 
265 270 275 

AGA AAT GTG TCC TAT CGA GCT CAG GAA CTC TTG GGA TGC CTA GTG TTC 
Arg Asn Val Ser Tyr Arg Ala Gln Glu Leu Leu Gly Cys Leu Val Phe 
280 285 290 

TCG ATT TGG TAC CCG TTG CTT GTT TCT TGT TTG CCT AAT TGG GGT GAA 
Ser Ile Trp Tyr Pro Leu Leu Val Ser Cys Leu Pro Asn Trp Gly Glu 
295 300 305 

AGA ATT ATG TTT GTT ATT GCA AGT TTA TCA GTG ACT GGA ATG CAA CAA 
Arg Ile Met Phe Val Ile Ala Ser Leu Ser Val Thr Gly Met Gln Gln 
310 315 320 

GTT CAG TTC TCC TTG AAC CAC TTC TCT TCA AGT GTT TAT GTT GGA AAG 
Val Gln Phe Ser Leu Asn His Phe Ser Ser Ser Val Tyr Val Gly Lys 
325 330 335 340 

CCT AAA GGG AAT AAT TGG TTT GAG AAA CAA AGG GAT GGG ACA CTT GAC 
Pro Lys Gly Asn Asn Trp Phe Glu Lys Gln Thr Asp Gly Thr Leu Asp 
345 350 355 

ATT TCT TGT CCT CCT TGG ATG GAT TGG TTT CAT GGT GGA TTG CAA TTC 
Ile Ser Cys Pro Pro Trp Met Asp Trp Phe His Gly Gly Leu Gln Phe 
360 365 370 

CAA ATT GAG CAT CAT TTG TTT CCC AAG ATG CCT AGA TGC AAC CTT AGG 
Gln Ile Glu His His Leu Phe Pro Lys Met Pro Arg Cys Asn Leu Arg 
375 380 385 

AAA ATC TCG CCC TAC GTG ATG GAG TTA TGC AAG AAA CAT AAT TTG CCT 
Lys Ile Ser Pro Tyr Val Ile Glu Leu Cys Lys Lys His Asn Leu Pro 
390 395 400 

TAC AAT TAT GCA TCT TTC TCC AAG GCC AAT GAA ATG ACA CTC AGA ACA 
Tyr Asn Tyr Ala Ser Phe Ser Lys Ala Asn Glu Met Thr Leu Arg Thr 
405 410 415 420 
OCA TTC CAG GCT AGG GAT ATA ACG AAC, CCG CTC CCG 
^T. a" Lu Gin Ala A.g Asp He 

]>eA\ Arg n::^ ^ 

CGT CTT CAC AGT GAT GGT T AAAATTACGG 
AAG AAT TTG GTA TGG GAA GGT CTT 
Lvs Asn Leu Vai irp 445 

.0 AGATTATGTA TGTCCTATCT TTGTGTCTTG TCTTGGTTCT 
„AGTTCATG TAATAATTTC G Tl ^^^^^^^^^^ ^^^^^^^ 

.CTTGTTOGA GTGATTGCAA GT G ^^^^^^^^^ ^^^^^^^^^ 

:::rTrT ggaatotac t^taccac gtgg™ OTTGAAGGTC 

::rG:T —TT TGTTTAAATG GTTATGTCAT GTTATTT 
(2) INFORMATION FOR SEQ ID NO: 2: 

?sro;/Gv^"°u.rea? 

MOLECULE TYPE: protein 

^ TIP Ser lie Gin Gly Ala Tyr 

mx/ ASP Leu Trp He Ser iie 
His ASP Lys pro Gly A^p ^5 

r.^^, nv ser Phe Pro Leu 
..p val se. ASP T.P val Lv= asp .Us Pro 0 V G V 
.e. L^e^,AXa GIV Gin GluValT.r Asp - 
. se. T« T. L.S Asn L. Asp L.s T.. 01. T. T. 

, , val se. Glu val Ser Lys ASP Tyr Ar, Lys Leu 
Lys ASP Tyr Ser val Se , 

■ . ^ T v.c; GlY His lie 

, ,er Lys Met Gly Leu Tyr Asp Lys Lys G^y 
val Phe Gl.. Pne Ser Ly. 

, .„r Leu cys Phe Ue Ala Met Leu P.e Ala Met 

Met Phe Ala Thr Leu uy 
115 

Gly Val Leu Phe Cys Glu 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 31..603 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TTAGCCTTTA CTCTATAGTT TTAGATAGAC ATG GCG AAT GTG GAT CGT GAT CGG 54 

Met Ala Asn Val Asp Arg Asp Arg 

140 

CGT GTG CAT GTA GAC CGT ACT GAC AAA CGT GTT CAT CAG CCA AAC TAC 102 
Arg Val His Val Asp Arg Thr Asp Lys Arg Val His Gln Pro Asn Tyr 
145 150 155 

GAA GAT GAT GTC GGT TTT GGT GGC TAT GGC GGT TAT GGT GCT GGT TCT 150 
Glu Asp Asp Val Gly Phe Gly Gly Tyr Gly Gly Tyr Gly Ala Gly Ser 
160 165 170 175 

GAT TAT AAG AGT CGC GGC CCC TCC ACT AAC CAA ATC TTG GCA CTT ATA 198 

Asp Tyr Lys Ser Arg Gly Pro Ser Thr Asn Gln Ile Leu Ala Leu Ile 

180 185 190 

GCA GGA GTT CCC ATT GGT GGC ACA CTG CTA ACC CTA GCT GGA CTC ACT 246 

Ala Gly Val Pro Ile Gly Gly Thr Leu Leu Thr Leu Ala Gly Leu Thr 
195 200 205 

CTA GCC GGT TCG GTG ATC GGC TTG CTA GTC TCC ATA CCC CTC TTC CTC 294 
Leu Ala Gly Ser Val Ile Gly Leu Leu Val Ser Ile Pro Leu Phe Leu 
210 215 220 

CTC TTC AGT CCG GTG ATA GTC CCG GCG GCT CTC ACT ATT GGG CTT GCT 342 

Leu Phe Ser Pro Val Ile Val Pro Ala Ala Leu Thr Ile Gly Leu Ala 
225 230 235 

GTG ACG GGA ATC TTG GCT TCT GGT TTG TTT GGG TTG ACG GGT CTG AGC 390 
Val Thr Gly Ile Leu Ala Ser Gly Leu Phe Gly Leu Thr Gly Leu Ser 
240 245 250 255 

TCG GTC TCG TGG GTC CTC AAC TAC CTC CGT GGG ACG AGT GAT ACA GTG 438 

Ser Val Ser Trp Val Leu Asn Tyr Leu Arg Gly Thr Ser Asp Thr Val 

260 265 270 
To OCT cor CO .CT o^c ^.o .CT ^cc cat gac cco 

SI 5il Sfu SI «a 31U Thr Glu P.e Me. 

. n .cc ... - ™— — = — ™ 
Sy LVS Sa Cly ser 

. ^..OTACT .TAC.™. C.T.CC.C.T .^O^T.T 

CTTTGTCTAT ATATGTGTTC ^ ppaCAAATCT CATACTATTT 

..T,. T^rTTTTCTTT TTTGAGATAA CCAGAAAlUi 
;VATAAGAAAT GAAATAAATA TGTTTTCTT 

TCTAAAAAAA AAAAAA-AAAA A 
(2) INFORMATION FOR SEQ ID NO: 4: 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

,„e. Ala .sn val Asp «a -p Val H.s val Asp 

v., >US 0.. AS. O.. ASP ASP va. 0. P.e 

... 0. Z - - To - - 

... AS. Z A. A.a Va. 

rw ser Val He Gly Leu 
r^T,, T PU Thr Leu Ala Gly ^ei. v 
,.eu Leu Thr Leu Ala Gly Leu ,5 

,:: .0 .a ... .^r- 
Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 

100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 185 190 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Ile Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 75 80 

Leu Val Ser Ile Pro Leu Phe Leu Ile Phe Ser Pro Val Ile Val Pro 

85 90 95 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 

130 135 140 
Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 

Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 


(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 

1 5 

Lys Arg Val His Gln Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 

35 

Thr Asn Gln Ile Leu Ala Leu Ile Ala Gly Val Pro Ile Gly Gly Thr 
50 55 60 

Leu Ile Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val Ile Gly Leu 
65 70 

Leu Val Ser Ile Pro Leu Phe Leu Ile Phe Ser Pro Val Ile Val Pro 

Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly 

100 105 110 
Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 

115 120 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys 
Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 

145 
Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Pro 
180 185 190 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Gln Leu Pro 

Pro Trp Ala Ser Asp Thr Val Pro Glu Gln Val Asp Tyr Ala Lys Arg 

25 30 

Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu Met 

40 45 

Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu Phe 

55 60 

Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
70 75 

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ala Asp Thr Ala Arg Gly Thr His His Asp Ile Ile Gly Arg Asp 
1 5 10 15 

Gln Tyr Pro Met Met Gly Arg Asp Arg Asp Gln Tyr Gln Met Ser Gly 
20 25 30 

Arg Gly Ser Asp Tyr Ser Lys Ser Arg Gln Ile Ala Lys Ala Ala Thr 
35 40 45 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Ser Leu Thr Leu 
50 55 60 

Val Gly Thr Val Leu Ala Leu Thr Val Ala Thr Pro Leu Leu Val Leu 
65 70 75 80 

Phe Ser Pro Ile Leu Val Pro Ala Leu Ile Thr Val Ala Leu Leu Ile 

85 90 95 

Thr Gly Phe Leu Ser Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Val 
100 105 110 

Phe Ser Trp Ile Tyr Lys Tyr Ala Thr Gly Glu His Pro Gln Gly Ser 
115 120 125 

Asp Lys Leu Asp Ser Ala Arg Met Lys Leu Gly Ser Lys Ala Gln Asp 
130 135 140 

Leu Lys Asp Arg Ala Gln Tyr Tyr Gly Gln Gln His Thr Gly Gly Glu 
145 150 155 160 

His Asp Arg Asp Arg Thr Arg Gly Gly Gln His Thr Thr 

165 170 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Ala Asp Gln Thr Arg Thr His His Glu Met Ile Ser Arg Asp Ser 
1 5 10 15 

Thr Gln Glu Ala His Pro Lys Ala Arg Gln Trp Val Lys Ala Ala Thr 
20 25 30 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Gln Leu Thr Leu 
35 40 45 

Ala Gly Thr Val Ile Ala Leu Thr Val Ala Thr Pro Leu Leu Val Ile 
50 55 60 

Phe Ser Pro Val Leu Val Pro Ala Val Val Thr Val Ala Leu Ile Ile 
65 70 75 80 

Thr Gly Phe Leu Ala Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Ala 

85 90 95 
Phe Ser Trp Leu Tyr Arg His Trp Thr Gly Ser Gly Ser Asp Lys Ile 

105 110 

Glu Trp Ala Arg Met Lys Val Gly Ser Arg Val Gln Asp Thr Lys Tyr 
115 120 125 

Gly Gln His Trp Ile Gly Val Gln His Gln Gln Val Ser 
130 135 140 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe 
1 5 10 

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp 
20 25 30 

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly 
35 40 45 

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val 

55 60 

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Leu Ala Gly Ser Val Ile 

75 80 

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile 

85 90 95 

Val Pro Ala Gly Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala 
100 105 110 

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met 
115 120 125 

Asn Tyr Leu Arg Gly Thr Ala Arg Thr Val Pro Glu Gln Leu Glu Tyr 
130 135 140 

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly 
145 150 155 160 

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln 

165 170 175 
Gln Gly Gly Thr Thr Ala Ala 
INFORMATION FOR SEQ ID 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

rr:r;:"-" ~ - - - - - 

1 on . Gin Gly Gin Tyr Glu Gly asp 

Ser pro Tyr Glu Gly Gly Ar. Gly Oln 

. .vs Ser Met Met pro Glu Ser Gly 
Giy Tyr Gly oly Gly Gly Tyr ^vs 

35 ^. Val pro Val Val 

pro Ser Ser Thr 

.la Gly Leu Leu He Ma Gly Ser Val lie 
Gly ser Leu He Ma -u Ma Gly 

. .eu pro Leu Phe Leu He P^e Ser Pro Val 
Oiy Leu Met Val Ma Leu Pro 

Met Thr Gly Phe Leu Ala 
n T..n Thr He Gly Leu Ma Met in 
„al pro Ala Ala Leu Thr 

,er ser He Ser Trp val Met 
T^Ho rlv Leu Thr Gly Leu Ser Ser 
qer Gly Met Phe Gly i-e 

. , j.ra Gly Thr Arg Arg Thr Val p 
Asn Tyr Leu Arg Giy 

val Gly Tyr Ma Gly Gin Lys Gly 
r c ;.ra Arg Met Ala Asp Ala Val Giy 
Ala Lys Arg 

.3 Glu Met Giy Gin His Val Gin Asn Lys 

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys 
Gln Gly Arg Thr Thr Ala Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCCCATTT TGCAGACCAA 60 

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120 

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180 

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240 

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300 

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360 

GAACTATAAC CATTAACCGT AAAAATAAAT TTACTACAGT AAAAAATTAT ACTAATTTCA 420 

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480 

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540 

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600 

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660 

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720 

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780 

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840 

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900 

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960 

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020 

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080 

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140 

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GTAATACGAC TCACTATAGG GC 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GGGGATCCTA TACTAAAACT ATAGAGTAAA GG 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Trp Ile Gly His Asp Ala Gly His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Asn Val Gly His Asp Ala Asn His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Val Leu Gly His Asp Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Val Ile Ala His Glu Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Val Ile Gly His Asp Cys Ala His 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Val Val Gly His Asp Cys Gly His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

His Asn Ala His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

His Asn Tyr Leu His His 
1 5 






(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 

His Arg Thr His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

His Arg Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

His Asp Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

His Asp Gln His His
1 5



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

His Asp His His His
1 5



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

His Asn His His His
1 5



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Phe Gln Ile Glu His His
1 5



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30 

His Gln Val Thr His His
1 5 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

His Val Ile His His
1 5 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

His Val Ala His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

His Ile Pro His His



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

His Val Pro His His



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

..OCTCCATC ACACAAAGAA AACCTC- ^^^^^^^^^^ ^^^^^^^ 

CAOAAACTCA OACACACTTC TCCT^CCTT ^^^^^^ ^^^^^^^^ 
eCAATACCTT ATAATCAA^ T^^^ 

TCCATCCACT GCCTACCCO ATT-T^ ^^^^^^ ^^^^^^^^^^ 

.CCTCCCTAA CAAAAATCOT ACT^^- ^^^^^^^^ ^^^^^^^^ 

— CACCAATATT T^C ^^^^^ 
O^CTATAAC CATTAACCGT AA^^- ^^^^^^^^ ^,,,,^T 

— — - C^^^- eCAATCTGTT ATTAOCATTT 
^TATTCTTC ATCCTTATTT TTGTT^^^^^ ^^^^^^ ^,,TTCATT ACTAATAAAA 
GAAAGCCTAA AATTCTATAT ACAGTACAAT 



60 
360
600
TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720 

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780 

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840 

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900 

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960 

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200 

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260

TAGATAGACA TGGCGAATGT GGATCGTGAT CGGCGTGTGC ATGTAGACCG TACTGACAAA 1320

CGTGTTCATC AGCCAAACTA CGAAGATGAT GTCGGTTTTG GTGGCTATGG CGGTTATGGT 1380

GCTGGTTCTG ATTATAAGAG TCGCGGCCCC TCCACTAACC AAGTATTTTT GTGGTCTCTT 1440 

TAGTTTTTCT TGTGTTTTCC TATGATCACG CTCTCCAAAC TATTTGAAGA TTTTCTGTAA 1500 

ATTCATTTTA AACAGAAAGA TAAATAAAAT AGTGAAGAAC CATAGGAATC GTACGTTACG 1560

TTAATTATTT CCTTTTAGTT CTTAAGTCCT AATTAGGATT CCTTTAAAAG TTGCAACAAT 1620

CTAATTGTTC ACAAAATGAG TTAAGTTTGA AACAGATTTT TATACACCAC TTGCATATGT 1680

TTATCATGGT GATGCATGCT TGTTAGATAA ACTCGATATA ATCAATACAT GCAGATCTTG 1740 

GCACTTATAG CAGGAGTCCA TTGGTGGCAC ACTGCTAACC CTAGCTGGAC TCACTCTAGC 1800 

CGGTTCGGTG ATCGGCTTGC TAGTCTCCAT ACCCCTCTTC CTCCTCTTCA GTCCGGTGAT 1860 

AGTCCCGGCG GCTCTCACTA TTGGGCTTGC TGTGACGGGA ATCTTGGCTT CTGGTTTGTT 1920 

TGGGTTGACG GGTCTGAGCT C 1941 

What is claimed is:

1. An isolated nucleic acid encoding an
oleosin 5' regulatory region which directs
seed-specific expression selected from the group
consisting of the nucleotide sequence set forth in SEQ
ID NO: 12, the nucleotide sequence set forth in SEQ ID
NO: 12 having an insertion, deletion, or substitution
of one or more nucleotides, or a contiguous fragment
of the nucleotide sequence set forth in SEQ ID NO: 12.

2. An expression cassette which comprises 
the oleosin 5' regulatory region of Claim 1 operably 
linked to at least one of a nucleic acid encoding a 
heterologous gene or a nucleic acid encoding a 
sequence complementary to a native plant gene. 

3. The expression cassette of Claim 2 
wherein the heterologous gene is at least one of a 
fatty acid synthesis gene or a lipid metabolism gene. 

4. The expression cassette of Claim 3
wherein the heterologous gene is selected from the
group consisting of an acetyl-CoA carboxylase gene, a
ketoacyl synthase gene, a malonyl transacylase gene, a
lipid desaturase gene, an acyl carrier protein (ACP)
gene, a thioesterase gene, an acetyl transacylase
gene, or an elongase gene.

5. The expression cassette of Claim 4
wherein the lipid desaturase gene is selected from the
group consisting of a Δ6-desaturase gene, a
Δ12-desaturase gene, and a Δ15-desaturase gene.

6. An expression vector which comprises the 
expression cassette of any one of Claims 2-5. 

7. A cell comprising the expression
cassette of any one of Claims 2-5.

8. A cell comprising the expression vector 
of Claim 6. 

9. The cell of Claim 7 wherein said cell is 
a bacterial cell or a plant cell. 

10. The cell of Claim 8 wherein said cell
is a bacterial cell or a plant cell.

11. A transgenic plant comprising the
expression cassette of any one of Claims 2-5.

12. A transgenic plant comprising the 
expression vector of Claim 6. 

13. A plant which has been regenerated from 
the plant cell of Claim 9. 

14. A plant which has been regenerated from 
the plant cell of Claim 10. 

15. The plant of Claim 12 or 13 wherein
said plant is at least one of a sunflower, soybean,
maize, cotton, tobacco, peanut, oil seed rape or
Arabidopsis plant.

16. Progeny of the plant of Claim 11 or 12.

17. Seed from the plant of Claim 11 or 12. 

18. A method of producing a plant with 
increased levels of a product of a fatty acid 
synthesis gene or a lipid metabolism gene which 
comprises : 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to at least one of an 
isolated nucleic acid coding for a fatty acid
synthesis gene or a lipid metabolism gene; and 

(b) regenerating a plant with increased 
levels of the product of said fatty acid synthesis or 
said lipid metabolism gene from said plant cell. 

19. A method of producing a plant with 
increased levels of gamma linolenic acid (GLA) content 
which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a Δ6-desaturase gene;
and 

(b) regenerating a plant with increased 
levels of GLA from said plant cell. 

20. The method of Claim 19 wherein said
Δ6-desaturase gene is at least one of a cyanobacterial
Δ6-desaturase gene or a borage Δ6-desaturase gene.

21. The method of any one of Claims 18-20 
wherein said plant is a sunflower, soybean, maize, 
tobacco, cotton, peanut, oil seed rape or Arabidopsis 
plant.

22. The method of Claim 18 wherein said
fatty acid synthesis gene or said lipid metabolism
gene is at least one of a lipid desaturase, an acyl
carrier protein (ACP) gene, a thioesterase gene, an
elongase gene, an acetyl transacylase gene, an
acetyl-CoA carboxylase gene, a ketoacyl synthase gene, or a
malonyl transacylase gene.

23. A method of inducing production of at
least one of gamma linolenic acid (GLA) or
octadecatetraenoic acid (OTA) in a plant deficient or
lacking in GLA which comprises transforming said plant
with an expression vector comprising the isolated
nucleic acid of Claim 1 operably linked to a
Δ6-desaturase gene and regenerating a plant with
increased levels of at least one of GLA or OTA.

24. A method of decreasing production of a 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
complementary to a fatty acid synthesis or lipid 
metabolism gene; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 

25. A method of cosuppressing a native 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a cell of the plant with an
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
encoding a fatty acid synthesis or lipid metabolism 
gene native to the plant; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 

AN OLEOSIN 5' REGULATORY REGION FOR THE
MODIFICATION OF PLANT SEED LIPID COMPOSITION

BACKGROUND OF THE INVENTION 

Seed oil content has traditionally been 
modified by plant breeding. The use of recombinant 
DNA technology to alter seed oil composition can 
accelerate this process and in some cases alter seed 
oils in a way that cannot be accomplished by breeding 
alone. The oil composition of Brassica has been 
significantly altered by modifying the expression of a 
number of lipid metabolism genes. Such manipulations 
of seed oil composition have focused on altering the 
proportion of endogenous component fatty acids. For 
example, antisense repression of the Δ12-desaturase
gene in transgenic rapeseed has resulted in an
increase in oleic acid of up to 83%. Töpfer et al.
1995 Science 268:681-686.

There have been some successful attempts at 
modifying the composition of seed oil in transgenic 
plants by introducing new genes that allow the 
production of a fatty acid that the host plants were 
not previously capable of synthesizing. Van de Loo
et al. (1995 Proc. Natl. Acad. Sci. USA 92:6743-6747)
have been able to introduce a Δ12-hydroxylase gene
into transgenic tobacco, resulting in the introduction 
of a novel fatty acid, ricinoleic acid, into its seed 
oil. The reported accumulation was modest from plants 
carrying constructs in which transcription of the 
hydroxylase gene was under the control of the 
cauliflower mosaic virus (CaMV) 35S promoter. 
Similarly, tobacco plants have been engineered to
produce low levels of petroselinic acid by expression
of an acyl-ACP desaturase from coriander (Cahoon et
al. 1992 Proc. Natl. Acad. Sci. USA 89:11184-11188).

The long chain fatty acids (C18 and larger)
have significant economic value both as nutritionally
and medically important foods and as industrial
commodities (Ohlrogge, J.B. 1994 Plant Physiol.
104:821-826). Linoleic (18:2 Δ9,12) and α-linolenic
acid (18:3 Δ9,12,15) are essential fatty acids found
in many seed oils. The levels of these fatty acids
have been manipulated in oil seed crops through
breeding and biotechnology (Ohlrogge et al. 1991
Biochim. Biophys. Acta 1082:1-26; Töpfer et al. 1995
Science 268:681-686). Additionally, the production of
novel fatty acids in seed oils can be of considerable 
use in both human health and industrial applications. 

Consumption of plant oils rich in
γ-linolenic acid (GLA) (18:3 Δ6,9,12) is thought to
alleviate hypercholesterolemia and other related
clinical disorders which correlate with susceptibility
to coronary heart disease (Brenner R.R. 1976 Adv. Exp.
Med. Biol. 83:85-101). The therapeutic benefits of
dietary GLA may result from its role as a precursor to
prostaglandin synthesis (Weete, J.D. 1980 in Lipid
Biochemistry of Fungi and Other Organisms, eds. Plenum
Press, New York, pp. 59-62). Linoleic acid (18:2) (LA)
is transformed into gamma linolenic acid (18:3) (GLA)
by the enzyme Δ6-desaturase.

Few seed oils contain GLA despite high
contents of the precursor linoleic acid. This is due
to the absence of Δ6-desaturase activity in most
plants. For example, only borage (Borago
officinalis), evening primrose (Oenothera biennis),
and currants (Ribes nigrum) produce appreciable
amounts of linolenic acid. Of these three species,
only Oenothera and borage are cultivated as a
commercial source for GLA. It would be beneficial if
agronomic seed oils could be engineered to produce GLA
in significant quantities by introducing a
heterologous Δ6-desaturase gene. It would also be
beneficial if other expression products associated 
with fatty acid synthesis and lipid metabolism could 
be produced in plants at high enough levels so that 
commercial production of a particular expression 
product becomes feasible. 

As disclosed in U.S. Patent No. 5,552,306, a
cyanobacterial Δ6-desaturase gene has been recently
isolated. Expression of this cyanobacterial gene in
transgenic tobacco resulted in significant but low
level GLA accumulation (Reddy et al. 1996 Nature
Biotech. 14:639-642). Applicant's copending U.S.
Application Serial No. 08/366,779 discloses a
Δ6-desaturase gene isolated from the plant Borago
officinalis and its expression in tobacco under the
control of the CaMV 35S promoter. Such expression
resulted in significant but low level GLA and
octadecatetraenoic acid (ODTA or OTA) accumulation in
seeds. Thus, a need exists for a promoter which

In another aspect of the present invention,
a method is provided for producing a plant with 
increased levels of a product of a fatty acid 
synthesis or lipid metabolism gene. 

In particular, there is provided a method 
for producing a plant with increased levels of a fatty 
acid synthesis or lipid metabolism gene by 
transforming a plant with the subject expression 
cassettes and expression vectors which comprise an
oleosin 5' regulatory region and a coding sequence for
a fatty acid synthesis or lipid metabolism gene. 

In another aspect of the present invention, 
there is provided a method for cosuppressing a native 
fatty acid synthesis or lipid metabolism gene by 
transforming a plant with the subject expression 
cassettes and expression vectors which comprise an 
oleosin 5' regulatory region and a coding sequence for 
a fatty acid synthesis or lipid metabolism gene.

A further aspect of this invention provides 
a method of decreasing production of a native plant 
gene such as a fatty acid synthesis gene or a lipid
metabolism gene by transforming a plant with an
expression vector comprising an oleosin 5' regulatory
region operably linked to a nucleic acid sequence 
complementary to a native plant gene. 

Also provided are methods of modulating the 
levels of a heterologous gene such as a fatty acid
synthesis or lipid metabolism gene by transforming a
plant with the subject expression cassettes and
expression vectors.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 depicts the nucleotide and
corresponding amino acid sequence of the borage
Δ6-desaturase gene (SEQ ID NO:1). The cytochrome b5
heme-binding motif is boxed and the putative metal
binding, histidine rich motifs (HRMs) are underlined.
The motifs recognized by the primers (PCR analysis)
are underlined with dotted lines, i.e. tgg aaa tgg aac
cat aa; and gag cat cat ttg ttt cc.

Fig. 2 is a dendrogram showing similarity of
the borage Δ6-desaturase to other membrane-bound
desaturases. The amino acid sequence of the borage
Δ6-desaturase was compared to other known desaturases
using GeneWorks (IntelliGenetics). Numerical values
correlate to relative phylogenetic distances between
subgroups compared.

Fig. 3A provides a gas liquid chromatography 
profile of the fatty acid methyl esters (FAMES) 
derived from leaf tissue of a wild type tobacco 
'Xanthi'.

Fig. 3B provides a gas liquid chromatography
profile of the FAMES derived from leaf tissue of a
tobacco plant transformed with the borage
Δ6-desaturase cDNA under transcriptional control of the
CaMV 35S promoter (pAN2). Peaks corresponding to
methyl linoleate (18:2), methyl γ-linolenate (18:3γ),
methyl α-linolenate (18:3α), and methyl
octadecatetraenoate (18:4) are indicated.

Fig. 4 is the nucleotide sequence and
corresponding amino acid sequence of the oleosin AtS21
cDNA (SEQ ID NO:3).

Fig. 5 is an acidic-base map of the 
predicted AtS21 protein generated by DNA Strider 1.2. 

Fig. 6 is a Kyte-Doolittle plot of the 
predicted AtS21 protein generated by DNA Strider 1.2. 

Fig. 7 is a sequence alignment of oleosins
isolated from Arabidopsis. Oleosin sequences
published or deposited in EMBL, BCM, NCBI databases
were aligned to each other using GeneWorks® 2.3.
Identical residues are boxed with rectangles. The
seven sequences fall into three groups. The first
group includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID
NO:6) and Z29859 (SEQ ID NO:7). The second group
includes X62352 (SEQ ID NO:8) and Ato13 (SEQ ID NO:9).
The third group includes X91956 (SEQ ID NO:10) and
L40954 (SEQ ID NO:11). Differences in amino acid
residues within the same group are indicated by
shadows. Ato12/Z54164 is identical to AtS21. The Ato13
sequence (Accession No. Z54165 in EMBL database) is
actually not disclosed in the EMBL database. The
Z54165 Accession number designates the same sequence
as Z54164, which is Ato12.

Fig. 8A is a Northern analysis of the AtS21 
gene. An RNA gel blot containing ten micrograms of 
total RNA extracted from Arabidopsis flowers (F),
leaves (L), roots (R), developing seeds (Se), and
developing silique coats (Si) was hybridized with a
probe made from the full-length AtS21 cDNA. 

Fig. 8B is a Southern analysis of the AtS21
gene. A DNA gel blot containing ten micrograms of
genomic DNA digested with BamHI (B), EcoRI (E),
HindIII (H), SacI (S), and XbaI (X) was hybridized
with a probe made from the full length AtS21 cDNA.

Fig. 9 is the nucleotide sequence of the
SacI fragment of AtS21 genomic DNA (SEQ ID NO:12).
The promoter and intron sequences are in uppercase.
The fragments corresponding to AtS21 cDNA sequence are
in lower case. The first ATG codon and a putative
TATA box are shadowed. The sequence complementary to
21P primer for PCR amplification is boxed. A putative
abscisic acid response element (ABRE) and two 14 bp
repeats are underlined.

Fig. 10 is a map of AtS21 promoter/GUS
construct (pAN5).

Fig. 11A depicts AtS21/GUS gene expression
in Arabidopsis bolt and leaves.

Fig. 11B depicts AtS21/GUS gene expression
in Arabidopsis siliques.

Fig. 11C depicts AtS21/GUS gene expression
in Arabidopsis developing seeds.

Figs. 11D through 11J depict AtS21/GUS gene
expression in Arabidopsis developing embryos.

Fig. 11K depicts AtS21/GUS gene expression
in Arabidopsis root and root hairs of a young
seedling.

Fig. 11L depicts AtS21/GUS gene expression
in Arabidopsis cotyledons and the shoot apex of a five
day seedling.

Figs. 11M and 11N depict AtS21/GUS gene
expression in Arabidopsis cotyledons and the shoot
apex of 5-15 day seedlings.

Fig. 12A depicts AtS21/GUS gene expression 
in tobacco embryos and endosperm. 

Fig. 12B depicts AtS21/GUS gene expression 
in germinating tobacco seeds. 

Fig. 12C depicts AtS21/GUS gene expression 
in a 5 day old tobacco seedling. 

Fig. 12D depicts AtS21/GUS gene expression 
in 5-15 day old tobacco seedlings. 

Fig. 13A is a Northern analysis showing
AtS21 mRNA levels in developing wild-type Arabidopsis
seedlings. Lane 1 was loaded with RNA from developing
seeds, lane 2 was loaded with RNA from seeds imbibed
for 24-48 hours, lane 3: 3 day seedlings; lane 4: 4
day seedlings; lane 5: 5 day seedlings; lane 6: 6 day
seedlings; lane 7: 9 day seedlings; lane 8: 12 day
seedlings. Probe was labeled AtS21 cDNA. Exposure
was for one hour at -80°C.

Fig. 13B is the same blot as Fig. 13A only 
exposure was for 24 hours at -80°C. 

Fig. 13C is the same blot depicted in Figs.
13A and 13B after stripping and hybridization with an
Arabidopsis tubulin gene probe. The small band in
each of lanes 1 and 2 is the remnant of the previous
AtS21 probe. Exposure was for 48 hours at -80°C.

Fig. 14 is a graph comparing GUS activities
expressed by the AtS21 and 35S promoters. GUS
activities expressed by the AtS21 promoter in
developing Arabidopsis seeds and leaf are plotted side
by side with those expressed by the 35S promoter. The
GUS activities expressed by the AtS21 promoter in
tobacco dry seed and leaf are plotted on the right
side of the figure. GUS activity in tobacco leaf is
so low that no column appears. "G-H" denotes globular
to heart stage; "H-T" denotes heart to torpedo stage;
"T-C" denotes torpedo to cotyledon stage; "Early C"
denotes early cotyledon; "Late C" denotes late
cotyledon. The standard deviations are listed in
Table 2.

Fig. 15A is an RNA gel blot analysis carried
out on 5 µg samples of RNA isolated from borage leaf,
root, and 12 dpp embryo tissue, using labeled borage
Δ6-desaturase cDNA as a hybridization probe.

Fig. 15B depicts a graph corresponding to 
the Northern analysis results for the experiment shown 
in Fig. 15A. 

Fig. 16A is a graph showing relative legumin 
RNA accumulation in developing borage embryos based on 
results of Northern blot. 

Fig. 16B is a graph showing relative 
oleosin RNA accumulation in developing borage embryos 
based on results of Northern blot. 

Fig. 16C is a graph showing relative
Δ6-desaturase RNA accumulation in developing borage
embryos based on results of Northern blot.

Fig. 17 is a PCR analysis showing the
presence of the borage delta 6-desaturase gene in
transformed plants of oilseed rape. Lanes 1, 3 and 4
were loaded with PCR reactions performed with DNA from
plants transformed with the borage delta 6-desaturase
gene linked to the oleosin 5' regulatory region; lane
2: DNA from plant transformed with the borage delta
6-desaturase gene linked to the albumin 5' regulatory
region; lanes 5 and 6: DNA from non-transformed
plants; lane 7: molecular weight marker (1 kb ladder,
Gibco BRL); lane 8: PCR without added template DNA;
lane 9: control with DNA from Agrobacterium
tumefaciens EHA 105 containing the plasmid pAN3 (i.e.
the borage delta 6-desaturase gene linked to the
oleosin 5' regulatory region).

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides isolated
nucleic acids encoding 5' regulatory regions from an
Arabidopsis oleosin gene. In accordance with the
present invention, the subject 5' regulatory regions,
when operably linked to either a coding sequence of a
heterologous gene or a sequence complementary to a
native plant gene, direct expression of the coding
sequence or complementary sequence in a plant seed.
The oleosin 5' regulatory regions of the present
invention are useful in the construction of an
expression cassette which comprises, in the 5' to 3'
direction, a subject oleosin 5' regulatory region, a
heterologous gene or sequence complementary to a
native plant gene under control of the regulatory
region, and a 3' termination sequence. Such an
expression cassette can be incorporated into a variety
of autonomously replicating vectors in order to
construct an expression vector.
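
The 5' to 3' ordering of the cassette described above can be illustrated with a minimal Python sketch. The class name, field names, and the short toy sequences are hypothetical and are not taken from the sequence listing; the `antisense` flag simply models the "sequence complementary to a native plant gene" option by inserting the reverse complement.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class ExpressionCassette:
    """Hypothetical container for the three cassette parts."""
    regulatory_region: str   # stands in for the oleosin 5' regulatory region
    insert: str              # heterologous coding sequence, or native gene for antisense
    terminator: str          # stands in for the 3' termination sequence
    antisense: bool = False  # True places the insert in complementary orientation

    def assemble(self) -> str:
        """Concatenate the parts in 5' to 3' order."""
        body = reverse_complement(self.insert) if self.antisense else self.insert
        return self.regulatory_region + body + self.terminator


# Toy placeholder sequences for illustration only.
sense = ExpressionCassette("TATAAA", "ATGGCT", "AATAAA")
anti = ExpressionCassette("TATAAA", "ATGGCT", "AATAAA", antisense=True)
```

Calling `sense.assemble()` returns the three parts joined in order, while `anti.assemble()` carries the insert as its reverse complement between the same flanking parts.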

It has been surprisingly found that plants
transformed with the expression vectors of the present
invention produce levels of GLA approaching the level
found in those few plant species which naturally
produce GLA such as evening primrose (Oenothera
biennis).

As used herein, the term "cassette" refers
to a nucleotide sequence capable of expressing a
particular gene if said gene is inserted so as to be
operably linked to one or more regulatory regions
present in the nucleotide sequence. Thus, for
example, the expression cassette may comprise a
heterologous coding sequence which is desired to be
expressed in a plant seed. The expression cassettes
and expression vectors of the present invention are
therefore useful for directing seed-specific
expression of any number of heterologous genes. The
term "seed-specific expression" as used herein refers
to expression in various portions of a plant seed such
as the endosperm and embryo.

An isolated nucleic acid encoding a 5'
regulatory region from an oleosin gene can be provided
as follows. Oleosin recombinant genomic clones are
isolated by screening a plant genomic DNA library with
a cDNA (or a portion thereof) representing oleosin
mRNA. A number of different oleosin cDNAs have been
isolated. The methods used to isolate such cDNAs as
well as the nucleotide and corresponding amino acid
sequences have been published in Kirik et al. 1996
Plant Mol. Biol. 31:413-417; Zou et al. Plant Mol.
Biol. 31:429-433; Van Rooijen et al. 1992 Plant Mol.
Biol. 18:1177-1179.

Virtual subtraction screening of a tissue
specific library using a random primed polymerase
chain reaction (RP-PCR) cDNA probe is another method of
obtaining an oleosin cDNA useful for screening a plant
genomic DNA library. Virtual subtraction screening
refers to a method where a cDNA library is constructed
from a target tissue and displayed at a low density so
that individual cDNA clones can be easily separated.
These cDNA clones are subtractively screened with
driver quantities (i.e., concentrations of DNA to
kinetically drive the hybridization reaction) of cDNA
probes made from tissue or tissues other than the
target tissue (i.e. driver tissue). The hybridized
plaques represent genes that are expressed in both the
target and the driver tissues; the unhybridized
plaques represent genes that may be target
tissue-specific or low abundant genes that cannot be
detected by the driver cDNA probe. The unhybridized
cDNAs are selected as putative target tissue-specific
genes and further analyzed by one-pass sequencing and
Northern hybridization.
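
The selection logic of virtual subtraction screening can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clone names, the abundance values, and the `detection_limit` parameter are hypothetical, and a real screen scores plaque hybridization signals rather than a lookup table.

```python
def virtual_subtraction(library_clones, driver_abundance, detection_limit=1.0):
    """Split plated clones by whether the driver probe would detect them.

    library_clones   -- clone names from the target-tissue cDNA library
    driver_abundance -- assumed transcript abundance of each clone in driver tissue
    detection_limit  -- assumed minimum abundance the driver probe can detect
    """
    hybridized, unhybridized = [], []
    for clone in library_clones:
        level = driver_abundance.get(clone, 0.0)  # absent from driver tissue -> 0
        if level >= detection_limit:
            hybridized.append(clone)      # expressed in both target and driver
        else:
            unhybridized.append(clone)    # target-specific or low abundance
    return hybridized, unhybridized


# Hypothetical example data.
clones = ["oleosin", "tubulin", "rare_kinase", "legumin"]
driver = {"tubulin": 50.0, "rare_kinase": 0.2}
hyb, unhyb = virtual_subtraction(clones, driver)
```

Note that `rare_kinase` ends up among the unhybridized clones even though it is present in the driver tissue, which is why the text calls for follow-up by one-pass sequencing and Northern hybridization.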

Random primed PCR (RP-PCR) involves
synthesis of large quantities of cDNA probes from a
trace amount of cDNA template. The method combines
the amplification power of PCR with the representation
of random priming to simultaneously amplify and label
double-stranded cDNA in a single tube reaction.

Methods considered useful in obtaining
oleosin genomic recombinant DNA are provided in
Sambrook et al. 1989, in Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, NY, for
example, or any of the myriad of laboratory manuals on
recombinant DNA technology that are widely available.
To determine nucleotide sequences, a multitude of
techniques are available and known to the ordinarily
skilled artisan. For example, restriction fragments
containing an oleosin regulatory region can be
subcloned into the polylinker site of a sequencing
vector such as pBluescript (Stratagene). These
pBluescript subclones can then be sequenced by the
double-stranded dideoxy method (Chen and Seeburg,
1985, DNA 4:165).

In a preferred embodiment, the oleosin
regulatory region comprises nucleotides 1-1267 of Fig.
9 (SEQ ID NO:12). Modifications to the oleosin
regulatory region as set forth in SEQ ID NO: 12 which
maintain the characteristic property of directing
seed-specific expression are within the scope of the
present invention. Such modifications include
insertions, deletions and substitutions of one or more
nucleotides.

The 5' regulatory region of the present
invention can be derived from restriction endonuclease
or exonuclease digestion of an oleosin genomic clone.
Thus, for example, the known nucleotide or amino acid
sequence of the coding region of an isolated oleosin
gene (e.g. Fig. 7) is aligned to the nucleic acid or
deduced amino acid sequence of an isolated oleosin
genomic clone and 5' flanking sequence (i.e. sequence
upstream from the translational start codon of the
coding region) of the isolated oleosin genomic clone
is located.

The oleosin 5' regulatory region as set
forth in SEQ ID NO: 12 (nucleotides 1-1267 of Fig. 9)
may be generated from a genomic clone having either or
both excess 5' flanking sequence or coding sequence by
exonuclease III-mediated deletion. This is
accomplished by digesting appropriately prepared DNA
with exonuclease III (exoIII) and removing aliquots at
increasing intervals of time during the digestion.
The resulting successively smaller fragments of DNA
may be sequenced to determine the exact endpoint of
the deletions. There are several commercially
available systems which use exonuclease III (exoIII)
to create such a deletion series, e.g. Promega
Biotech, "Erase-A-Base" system. Alternatively, PCR
primers can be defined to allow direct amplification
of the subject 5' regulatory regions.
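
As a rough illustration of what a deletion series looks like on paper, the following Python sketch enumerates successively shorter fragments of a region by trimming a fixed number of nucleotides from the 5' end. The function name and the fixed `step` are assumptions made for the example; an actual exonuclease III digestion yields endpoints that must be determined by sequencing, as the text notes.

```python
def five_prime_deletion_series(region: str, step: int):
    """Yield (start, fragment) pairs for nested 5' deletions of `region`.

    `start` is the 1-based coordinate of the first retained nucleotide.
    `step` is an assumed, idealized number of nucleotides removed per time
    point; the 3' end of the region is kept fixed in every fragment.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    for offset in range(0, len(region), step):
        yield offset + 1, region[offset:]


# Toy 12-nucleotide region for illustration only.
series = list(five_prime_deletion_series("ACGTACGTACGT", 5))
for start, fragment in series:
    print(f"retained from nt {start}: {len(fragment)} nt")
```

Each fragment produced this way is a contiguous portion of the starting region, which is the property the deletion fragments discussed in the text are required to have.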

Using the same methodologies, the
ordinarily skilled artisan can generate one or more
deletion fragments of nucleotides 1-1267 as set forth
in SEQ ID NO: 12. Any and all deletion fragments which
comprise a contiguous portion of nucleotides set forth
in SEQ ID NO: 12 and which retain the capacity to
direct seed-specific expression are contemplated by
the present invention.

The identification of oleosin 5' regulatory
sequences which direct seed-specific expression
comprising nucleotides 1-1267 of SEQ ID NO: 12 and
modifications or deletion fragments thereof can be
accomplished by transcriptional fusions of specific
sequences with the coding sequences of a heterologous
gene, transfer of the chimeric gene into an
appropriate host, and detection of the expression of
the heterologous gene. The assay used to detect
expression depends upon the nature of the heterologous
sequence. For example, reporter genes, exemplified by
chloramphenicol acetyl transferase and β-glucuronidase
(GUS), are commonly used to assess transcriptional and
translational competence of chimeric constructions.
Standard assays are available to sensitively detect
the reporter enzyme in a transgenic organism. The
β-glucuronidase (GUS) gene is useful as a reporter of
promoter activity in transgenic plants because of the
high stability of the enzyme in plant cells, the lack
of intrinsic β-glucuronidase activity in higher plants
and availability of a quantitative fluorimetric assay
and a histochemical localization technique. Jefferson
et al. (1987 EMBO J. 6:3901) have established standard
procedures for biochemical and histochemical detection
of GUS activity in plant tissues. Biochemical assays
are performed by mixing plant tissue lysates with
4-methylumbelliferyl-β-D-glucuronide, a fluorimetric
substrate for GUS, incubating one hour at 37°C, and
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then measuring the fluorescence of the resulting 4- 
methyl - umbell i f erone . His tochemical localization for 
GUS activity is determined by incubating plant tissue 
samples in 5 -bromo- 4 - chloro- 3 - indolyl -glucuronide (X- 
Gluc) for about 18 hours at and observing the 

staining pattern of X-Gluc. The construction of such 
chimeric genes allows definition of specific 
regulatory sequences and demonstrates that these 
sequences can direct expression of heterologous genes 
in a seed-specific manner. 
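
The fluorimetric GUS assay described above is usually reduced to a specific activity. The sketch below shows one way to do that arithmetic. It assumes a linear 4-methylumbelliferone (4-MU) standard curve and a separately measured protein amount; the function name, parameter names and example numbers are hypothetical and do not come from the cited procedures.

```python
def gus_specific_activity(
    fluor_start: float,
    fluor_end: float,
    minutes: float,
    fluor_per_pmol_mu: float,
    protein_mg: float,
) -> float:
    """Return GUS activity as pmol 4-MU released per minute per mg protein.

    fluor_per_pmol_mu is the slope of an assumed linear 4-MU standard curve
    (fluorescence units per pmol), measured on the same fluorimeter.
    """
    if minutes <= 0 or fluor_per_pmol_mu <= 0 or protein_mg <= 0:
        raise ValueError("minutes, standard-curve slope and protein must be positive")
    pmol_mu = (fluor_end - fluor_start) / fluor_per_pmol_mu
    return pmol_mu / minutes / protein_mg


# Hypothetical readings for a one-hour incubation at 37 degrees C.
activity = gus_specific_activity(
    fluor_start=100.0, fluor_end=1300.0, minutes=60.0,
    fluor_per_pmol_mu=2.0, protein_mg=0.05,
)
```

Comparing this value between seed and non-seed tissues of the same transformant is one way to express how seed-specific a given promoter fragment is.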

Another aspect of the invention is directed 
to expression cassettes and expression vectors (also 
termed herein "chimeric genes") comprising a 5' 
regulatory region from an oleosin gene which directs 
seed specific expression operably linked to the coding 
sequence of a heterologous gene such that the 
regulatory element is capable of controlling 
expression of the product encoded by the heterologous 
gene. The heterologous gene can be any gene other 
than oleosin. If necessary, additional regulatory 
elements or parts of these elements sufficient to 
cause expression resulting in production of an 
effective amount of the polypeptide encoded by the 
heterologous gene are included in the chimeric 
constructs. 

Accordingly, the present invention provides 
chimeric genes comprising sequences of the oleosin 5' 
regulatory region that confer seed-specific expression 
which are operably linked to a sequence encoding a 
heterologous gene such as a lipid metabolism enzyme. 
Examples of lipid metabolism genes useful for 
practicing the present invention include lipid 
desaturases such as Δ6-desaturases, Δ12-desaturases, 
Δ15-desaturases and other related desaturases such as 
stearoyl-ACP desaturases, acyl carrier proteins 
(ACPs), thioesterases, acetyl transacylases, 
acetyl-CoA carboxylases, ketoacyl-synthases, malonyl 
transacylases, and elongases. Such lipid metabolism 
genes have been isolated and characterized from a 
number of different bacteria and plant species. Their 
nucleotide coding sequences as well as methods of 
isolating such coding sequences are disclosed in the 
published literature and are widely available to those 
of skill in the art. 

In particular, the Δ6-desaturase genes 
disclosed in U.S. Patent No. 5,552,306 and 
applicants' copending U.S. Application Serial No. 
08/366,779 filed December 30, 1994 and incorporated 
herein by reference, are contemplated as lipid 
metabolism genes particularly useful in the practice 
of the present invention. 

The chimeric genes of the present invention 
are constructed by ligating a 5' regulatory region of 
an oleosin genomic DNA to the coding sequence of a 
heterologous gene. The juxtaposition of these 
sequences can be accomplished in a variety of ways. 
In a preferred embodiment the order of the sequences, 
from 5' to 3', is an oleosin 5' regulatory region 
(including a promoter), a coding sequence, and a 
termination sequence which includes a polyadenylation 
site. 

Standard techniques for construction of such 
chimeric genes are well known to those of ordinary 
skill in the art and can be found in references such 
as Sambrook et al. (1989). A variety of strategies are 
available for ligating fragments of DNA, the choice of 
which depends on the nature of the termini of the DNA 
fragments. One of ordinary skill in the art 
recognizes that in order for the heterologous gene to 
be expressed, the construction requires promoter 
elements and signals for efficient polyadenylation of 
the transcript. Accordingly, the oleosin 5' 
regulatory region that contains the consensus promoter 
sequence known as the TATA box can be ligated directly 
to a promoterless heterologous coding sequence. 

[...] deletion fragments that 
contain the oleosin TATA box are ligated in a forward 
orientation to a promoterless heterologous gene such 
as the coding sequence of β-glucuronidase (GUS). The 
skilled artisan will recognize that the subject 
oleosin 5' regulatory regions can be provided by other 
means, for example chemical or enzymatic synthesis. 
The 3' end of a heterologous coding sequence is 
optionally ligated to a termination sequence 
comprising a polyadenylation site, exemplified by, but 
not limited to, the [...] polyadenylation 
site or the [...] polyadenylation 
site. Alternatively, the polyadenylation site can be 
provided by the heterologous gene. 

The present invention also provides methods 
of increasing levels of heterologous genes in plant 
seeds. In accordance with such methods, the subject 
expression cassettes and expression vectors are 
introduced into a plant in order to effect expression 
of a heterologous gene. For example, a method of 
producing a plant with increased levels of a product 
of a fatty acid synthesis or lipid metabolism gene is 
provided by transforming a plant cell with an 
expression vector comprising an oleosin 5' regulatory 
region operably linked to a fatty acid synthesis or 
lipid metabolism gene and regenerating a plant with 
increased levels of the product of said fatty acid 
synthesis or lipid metabolism gene. 

Another aspect of the present invention 
provides methods of reducing levels of a product of a 
gene which is native to a plant which comprises 
transforming a plant cell with an expression vector 
comprising a subject oleosin regulatory region 
operably linked to a nucleic acid sequence which is 
complementary to the native plant gene. In this 
manner, levels of endogenous product of the native 
plant gene are reduced through the mechanism known as 
antisense regulation. Thus, for example, levels of a 
product of a fatty acid synthesis gene or lipid 
metabolism gene are reduced by transforming a plant 
with an expression vector comprising a subject oleosin 
5' regulatory region operably linked to a nucleic acid 
sequence which is complementary to a nucleic acid 
sequence coding for a native fatty acid synthesis or 
lipid metabolism gene. 
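
The "complementary" sequence used for antisense regulation is the reverse complement of the coding strand of the native gene. A minimal Python sketch follows; the function name and the six-nucleotide example are hypothetical, and the sketch handles only unambiguous A/C/G/T bases.

```python
# Translation table for Watson-Crick base pairing (DNA, unambiguous bases only).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical fragment of a coding strand and its antisense counterpart.
sense = "ATGGCC"
antisense = reverse_complement(sense)
```

Placing `antisense` downstream of the regulatory region, in place of `sense`, is what distinguishes the antisense construct from the cosuppression construct described next, which uses the coding sequence itself.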

The present invention also provides a method 
of cosuppressing a gene which is native to a plant 
which comprises transforming a plant cell with an 
expression vector comprising a subject oleosin 5' 
regulatory region operably linked to a nucleic acid 
sequence coding for the native plant gene. In this 
manner, levels of endogenous product of the native 
plant gene are reduced through the mechanism known as 
cosuppression. Thus, for example, levels of a product 
of a fatty acid synthesis gene or lipid metabolism 
gene are reduced by transforming a plant with an 
expression vector comprising a subject oleosin 5' 
regulatory region operably linked to a nucleic acid 
sequence coding for a native fatty acid synthesis or 
lipid metabolism gene native to the plant. Although 
the exact mechanism of cosuppression is not completely 
understood, one skilled in the art is familiar with 
published works reporting the experimental conditions 
and results associated with cosuppression (Napoli et 
al. 1990 The Plant Cell 2:270-289; Van der Krol 1990 
The Plant Cell 2:291-299). 

To provide regulated expression of the 
heterologous or native genes, plants are transformed 
with the chimeric gene constructions of the invention. 
Methods of gene transfer are well known in the art. 
The chimeric genes can be introduced into plants by 
the leaf disk transformation-regeneration procedure as 
described by Horsch et al. 1985 Science 227:1229. 
Other methods of transformation such as protoplast 
culture (Horsch et al. 1984 Science 223:496, DeBlock 
et al. 1984 EMBO J. 2:2143, Barton et al. 1983, Cell 
32:1033) can also be used and are within the scope of 
this invention. In a preferred embodiment, plants are 
transformed with Agrobacterium-derived vectors such as 
those described in Klee et al. (1987) Annu. Rev. 
Plant Physiol. 38:467. Other well-known methods are 
available to insert the chimeric genes of the present 
invention into plant cells. Such alternative methods 
include biolistic approaches (Klein et al. 1987 Nature 
327:70), electroporation, chemically-induced DNA 
uptake, and use of viruses or pollen as vectors. 

When necessary for the transformation 
method, the chimeric genes of the present invention 
can be inserted into a plant transformation vector, 
e.g. the binary vector described by Bevan, M. 1984 
Nucleic Acids Res. 12:8711-8721. Plant transformation 
vectors can be derived by modifying the natural gene 
transfer system of Agrobacterium tumefaciens. The 
natural system comprises large Ti (tumor-inducing) 
plasmids containing a large segment, known as T-DNA, 
which is transferred to transformed plants. Another 
segment of the Ti plasmid, the vir region, is 
responsible for T-DNA transfer. The T-DNA region is 
bordered by terminal repeats. In the modified binary 
vectors, the tumor inducing genes have been deleted 
and the functions of the vir region are utilized to 
transfer foreign DNA bordered by the T-DNA border 
sequences. The T-region also contains a selectable 
marker for antibiotic resistance, and a multiple 
cloning site for inserting sequences for transfer. 
Such engineered strains are known as "disarmed" A. 
tumefaciens strains, and allow the efficient transfer 
of sequences bordered by the T-region into the nuclear 
genome of plants. 



Surface-sterilized leaf disks and other 
susceptible tissues are inoculated with the "disarmed" 
foreign DNA-containing A. tumefaciens, cultured for a 
number of days, and then transferred to 
antibiotic-containing medium. Transformed shoots are then 
selected after rooting in medium containing the 
appropriate antibiotic, and transferred to soil. 
Transgenic plants are pollinated and seeds from these 
plants are collected and grown on antibiotic medium. 

Expression of a heterologous or reporter 
gene in developing seeds, young seedlings and mature 
plants can be monitored by immunological, 
histochemical or activity assays. As discussed 
herein, the choice of an assay for expression of the 
chimeric gene depends upon the nature of the 
heterologous coding region. For example, Northern 
analysis can be used to assess transcription if 
appropriate nucleotide probes are available. If 
antibodies to the polypeptide encoded by the 
heterologous gene are available, Western analysis and 
immunohistochemical localization can be used to assess 
the production and localization of the polypeptide. 
Depending upon the heterologous gene, appropriate 
biochemical assays can be used. For example, 
acetyltransferases are detected by measuring 
acetylation of a standard substrate. The expression 
of a lipid desaturase gene can be assayed by analysis 
of fatty acid methyl esters (FAMEs). 

Another aspect of the present invention 
provides transgenic plants or progeny of these plants 
containing the chimeric genes of the invention. Both 
monocotyledonous and dicotyledonous plants are 
contemplated. Plant cells are transformed with the 
chimeric genes by any of the plant transformation 
methods described above. The transformed plant cell, 
usually in the form of a callus culture, leaf disk, 
explant or whole plant (via the vacuum infiltration 
method of Bechtold et al. 1993 C.R. Acad. Sci. Paris, 
315:1194-1199) is regenerated into a complete 
transgenic plant by methods well-known to one of 
ordinary skill in the art (e.g. Horsch et al. 1985 
Science 227:1229). In a preferred embodiment, the 
transgenic plant is sunflower, cotton, oil seed rape, 
maize, tobacco, Arabidopsis, peanut or soybean. Since 
progeny of transformed plants inherit the chimeric 
genes, seeds or cuttings from transformed plants are 
used to maintain the transgenic line. 

The following examples further illustrate 
the invention. 

EXAMPLE 1 

Isolation of Membrane-Bound Polysomal 
RNA and Construction of Borage cDNA Library 

Membrane-bound polysomes were isolated from 
borage seeds 12 days post pollination (12 DPP) using 
the protocol established for peas by Larkins and 
Davies (1975 Plant Phys. 55:749-756). RNA was 
extracted from the polysomes as described by Mechler 
(1987 Methods in Enzymology 152:241-248, Academic 
Press). Poly-A+ RNA was isolated from the membrane 
bound polysomal RNA using Oligotex-dT™ beads (Qiagen). 

Corresponding cDNA was made using 
Stratagene's ZAP cDNA synthesis kit. The cDNA library 
was constructed in the lambda ZAP II vector 
(Stratagene) using the lambda ZAP II kit. The primary 
library was packaged with Gigapack II Gold packaging 
extract (Stratagene). 

EXAMPLE 2 

Isolation of a Δ6-Desaturase cDNA from Borage 

Hybridization protocol 

The amplified borage cDNA library was plated 
at low density (500 pfu on 150 mm petri dishes). 
Highly prevalent seed storage protein cDNAs were 
reduced (subtracted from the total cDNAs) by screening 
with the corresponding cDNAs. 

Hybridization probes for screening the 
borage cDNA library were generated by using random 
primed DNA synthesis as described by Ausubel et al. 
(1994 Current Protocols in Molecular Biology, Wiley 
Interscience, N.Y.) and corresponded to previously 
identified abundantly expressed seed storage protein 
cDNAs. Unincorporated nucleotides were removed by use 
of a G-50 spin column (Boehringer Mannheim). Probe was 
denatured for hybridization by boiling in a water bath 
for 5 minutes, then quickly cooled on ice. 
Nitrocellulose filters carrying fixed recombinant 
bacteriophage were prehybridized at 60°C for 2-4 hours 
in hybridization solution [4X SET (600 mM NaCl, 80 mM 
Tris-HCl, 4 mM Na2EDTA; pH 7.8), 5X Denhardt's reagent 
(0.1% bovine serum albumin, 0.1% Ficoll, and 0.1% 
polyvinylpyrrolidone), 100 µg/ml denatured salmon sperm 
DNA, 50 µg/ml polyadenine and 10 µg/ml polycytidine]. 
This was replaced with fresh hybridization solution to 
which denatured radioactive probe (2 ng/ml 
hybridization solution) was added. The filters were 
incubated at 60°C with agitation overnight. Filters 
were washed sequentially in 4X, 2X, and 1X SET (150 mM 
NaCl, 20 mM Tris-HCl, 1 mM Na2EDTA; pH 7.8) for 
[...] minutes each at 60°C. Filters were [...] 
exposed to X-ray film [...] with intensifying 
screens at -80°C. 
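
The SET wash series above is a set of simple dilutions, which can be sketched as follows. The 1X composition is the 4X recipe given in the text divided by four. The 20X stock in the example and both function names are assumptions made for illustration and are not part of the described protocol.

```python
# 1X SET in mM: the 4X recipe in the text (600/80/4 mM) divided by four.
SET_1X_MM = {"NaCl": 150.0, "Tris-HCl": 20.0, "Na2EDTA": 1.0}


def set_concentrations(fold: float) -> dict[str, float]:
    """Return component concentrations (mM) of SET buffer at the given strength."""
    return {component: mm * fold for component, mm in SET_1X_MM.items()}


def stock_volume_ml(stock_fold: float, target_fold: float, final_ml: float) -> float:
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2."""
    if target_fold > stock_fold:
        raise ValueError("target strength cannot exceed stock strength")
    return final_ml * target_fold / stock_fold


# Composition of each wash in the 4X, 2X, 1X series.
wash_series = {f"{fold}X": set_concentrations(fold) for fold in (4, 2, 1)}
# Hypothetical example: 500 ml of 2X SET made from an assumed 20X stock.
stock_ml = stock_volume_ml(stock_fold=20, target_fold=2, final_ml=500)
```

Lower salt at a constant 60°C raises wash stringency, which is why the series runs from 4X down to 1X.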

[...] protocol and reagents. Resulting bacterial 
[...] manually or by an ABI automated sequencer. 
[...] a sequence tag generated from 200-300 [...]. 
Sequencing was performed by cycle sequencing [...]. 
Over 300 expressed sequence tags [...]. 
Each sequence tag was compared to the GenBank database 
using the BLAST algorithm (Altschul et al. 1990 J. Mol. 
Biol. 215:403-410). [...] lipid metabolism 
genes, including the Δ6-desaturase [...]. 

Database searches [...] 
designated mbp-65 using BLASTX with the GenBank 
database resulted in a [...] to the 
previously isolated Synechocystis Δ6-desaturase. It 
was determined, however, that [...] a full 
length cDNA. A full length [...] 
to screen the borage [...] 
library. The resultant [...] was designated pAN1 and 
the cDNA insert of pAN1 was sequenced by the cycle 
sequencing method. The amino acid sequence deduced 
from the open reading frame (Fig. 1, SEQ ID NO:1) was 
compared to other known desaturases using Geneworks 
(IntelliGenetics) protein alignment program. This 
alignment indicated that the cDNA insert of pAN1 was 
the borage Δ6-desaturase gene. 

The resulting dendrogram (Figure 2) shows 
that Δ15-desaturases and Δ12-desaturases comprise two 
groups. The newly isolated borage sequence and the 
previously isolated Synechocystis Δ6-desaturase (U.S. 
Patent No. 5,552,306) formed a third distinct group. 
A comparison of amino acid motifs common to 
desaturases and thought to be involved catalytically 
in metal binding illustrates the overall similarity of 
the protein encoded by the borage gene to desaturases 
in general and the Synechocystis Δ6-desaturase in 
particular (Table 1). At the same time, comparison of 
the motifs in Table 1 indicates definite differences 
between this protein and other plant desaturases. 
Furthermore, the borage sequence is also distinguished 
from known plant membrane associated fatty acid 
desaturases by the presence of a heme binding motif 
conserved in cytochrome b5 proteins (Schmidt et al. 
1994 Plant Mol. Biol. 26:631-642) (Figure 1). Thus, 
while these results clearly suggested that the 
isolated cDNA was a borage Δ6-desaturase gene, further 
confirmation was necessary. To confirm the identity 
of the borage Δ6-desaturase cDNA, the cDNA insert from 
pAN1 was cloned into an expression cassette for stable 
expression. The vector pBI121 (Jefferson et al. 1987 
EMBO J. 6:3901-3907) was prepared for ligation by 
digestion with BamHI and EcoICR I (an isoschizomer of 
SacI which leaves blunt ends; available from Promega) 
which excises the GUS coding region leaving the 35S 
promoter and NOS terminator intact. The borage 
Δ6-desaturase cDNA was excised from the recombinant 
plasmid (pAN1) by digestion with BamHI and XhoI. The 
XhoI end was made blunt by performing a fill-in 
reaction catalyzed by the Klenow fragment of DNA 
polymerase I. This fragment was then cloned into the 
BamHI/EcoICR I sites of pBI121.1, resulting in the 
plasmid pAN2. 

EXAMPLE 3 

Production of Transgenic 
Plants and Preparation and 
Analysis of Fatty Acid Methyl Esters (FAMEs) 

The expression plasmid, pAN2, was used to 
transform tobacco (Nicotiana tabacum cv. xanthi) via 
Agrobacterium tumefaciens according to standard 
procedures (Horsch et al. 1985 Science 227:1229-1231; 
Bogue et al. 1990 Mol. Gen. Genet. 221:49-57) except 
that the initial transformants were selected on 100 
µg/ml kanamycin. 

Tissue from transgenic plants was frozen in 
liquid nitrogen and lyophilized overnight. FAMEs were 
prepared as described by Dahmer et al. (1989) J. 
Amer. Oil Chem. Soc. 66:543-548. In some cases, the 
solvent was evaporated again, and the FAMEs were 
resuspended in ethyl acetate and extracted once with 
deionized water to remove any water soluble 
contaminants. FAMEs were analyzed using a Tracer-560 
gas liquid chromatograph as previously described 
(Reddy et al. 1996 Nature Biotech. 14:639-642). 
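
FAME chromatograms are commonly summarized as the percentage each fatty acid contributes to the total integrated peak area. The sketch below illustrates that calculation only. The peak labels and areas are invented for the example and are not data from this work, and area percent is treated as a rough proxy for composition without detector response factors.

```python
def fatty_acid_percentages(peak_areas: dict[str, float]) -> dict[str, float]:
    """Convert integrated GC peak areas into percent of total fatty acids."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical peak areas from one leaf sample (arbitrary integrator units).
example_areas = {"16:0": 20.0, "18:2": 50.0, "18:3 (GLA)": 30.0}
composition = fatty_acid_percentages(example_areas)
```

A new peak such as GLA appearing in transformed tissue, and absent from the untransformed control, is the kind of evidence the next paragraph reports.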

As shown in Figure 3, transgenic tobacco 
leaves containing the borage cDNA produced both GLA 
and octadecatetraenoic acid (OTA) (18:4 Δ6,9,12,15). 
These results thus demonstrate that the isolated cDNA 
encodes a borage Δ6-desaturase. 

EXAMPLE 4 

Expression of Δ6-desaturase in Borage 

The native expression of Δ6-desaturase was 
examined by Northern analysis of RNA derived from 
borage tissues. RNA was isolated from developing 
borage embryos following the method of Chang et al. 
1993 Plant Mol. Biol. Rep. 11:113-116. RNA was 
electrophoretically separated on formaldehyde-agarose 
gels, blotted to nylon membranes by capillary 
transfer, and immobilized by baking at 80°C for 30 
minutes following standard protocols (Brown T., 1996 
in Current Protocols in Molecular Biology, eds. 
Ausubel et al. [Greene Publishing and 
Wiley-Interscience, New York] pp. 4.9.1-4.9.14). The 
filters were preincubated at 42°C in a solution 
containing 50% deionized formamide, 5X Denhardt's 
reagent, 5X SSPE (900 mM NaCl; 50 mM sodium phosphate, 
pH 7.7; and 5 mM EDTA), 0.1% SDS, and 200 µg/ml 
denatured salmon sperm DNA. After two hours, the 
filters were added to a fresh solution of the same 
composition with the addition of denatured radioactive 
hybridization probe. In this instance, the probes 
used were borage legumin cDNA (Fig. 16A), borage 
oleosin cDNA (Fig. 16B), and borage Δ6-desaturase cDNA 
(pAN1, Example 2) (Fig. 16C). The borage legumin and 
oleosin cDNAs were isolated by EST cloning and 
identified by comparison to the GenBank database using 
the BLAST algorithm as described in Example 2. 
Loading variation was corrected by normalizing to 
levels of borage EF1α mRNA. EF1α mRNA was identified 
by correlating to the corresponding cDNA obtained by 
the EST analysis described in Example 2. The filters 
were hybridized at 42°C for 12-20 hours, then washed 
as described above (except that the temperature was 
65°C), air dried, and exposed to X-ray film. 
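
The loading correction mentioned above amounts to dividing each lane's target signal by the signal of the loading control in the same lane. A minimal sketch, assuming band intensities have already been quantified by densitometry; the function name and the numbers are hypothetical.

```python
def normalize_to_loading_control(target: list[float], control: list[float]) -> list[float]:
    """Divide each lane's target signal by its loading-control signal."""
    if len(target) != len(control):
        raise ValueError("target and control must have one value per lane")
    if any(c <= 0 for c in control):
        raise ValueError("loading-control signals must be positive")
    return [t / c for t, c in zip(target, control)]


# Hypothetical densitometry values for two lanes (arbitrary units).
target_signal = [100.0, 300.0]   # e.g. desaturase probe
control_signal = [50.0, 100.0]   # e.g. EF1-alpha probe
relative_expression = normalize_to_loading_control(target_signal, control_signal)
```

Here the second lane carries twice the RNA of the first, so its threefold raw signal corresponds to only a 1.5-fold change after normalization.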



As depicted in Figs. 15A and 15B, Δ6-desaturase 
is expressed primarily in borage seed. 
Borage seeds reach maturation between 18-20 days post 
pollination (dpp). Δ6-desaturase mRNA expression 
occurs throughout the time points collected (8-20 
dpp), but appears maximal from 10-16 days post 
pollination. This expression profile is similar to 
that seen for borage oleosin and 12S seed storage 
protein mRNAs (Figs. 16A, 16B, and 16C). 

EXAMPLE 5 

Isolation and Characterization of a Novel Oleosin cDNA 

The oleosin cDNA (AtS21) was isolated by 
virtual subtraction screening of an Arabidopsis 
developing seed cDNA library using a random primed 
polymerase chain reaction (RP-PCR) cDNA probe derived 
from root tissue. 

RNA preparation 

Arabidopsis thaliana Landsberg erecta plants 
were grown under continuous illumination in a 
vermiculite/soil mixture at ambient temperature 
(22°C). Siliques 2-5 days after flowering were 
dissected to separately collect developing seeds and 
silique coats. Inflorescences containing initial 
flower buds and fully opened flowers, leaves, and 
whole siliques one or three days after flowering were 
also collected. Roots were obtained from seedlings 
that had been grown in Gamborg liquid medium (GIBCO 
BRL) for two weeks. The seeds for root culture were 
previously sterilized with 50% bleach for five minutes 
and rinsed with water extensively. All tissues were 
frozen in liquid nitrogen and stored at -80°C until 
use. Total RNAs were isolated following a hot 
phenol/SDS extraction and LiCl precipitation protocol 
(Harris et al. 1978 Biochem. 17:3251-3256; Galau et 
al. 1981 J. Biol. Chem. 256:2551-2560). Poly A+ RNA 
was isolated using oligo dT column chromatography 
according to manufacturers' protocols (PHARMACIA or 
STRATAGENE) or using oligotex-dT latex particles 
(QIAGEN). 

Construction of tissue-specific cDNA libraries 

Flower, one day silique, three day silique, 
leaf, root, and developing seed cDNA libraries were 
each constructed from 5 µg poly A+ RNA using the ZAP 
cDNA synthesis kit (Stratagene). cDNAs were 
directionally cloned into the EcoRI and XhoI sites of 
pBluescript SK(-) in the λ-ZAPII vector (Short et al. 
1988 Nucleic Acids Res. 16:7583-7600). Nonrecombinant 
phage plaques were identified by blue color 
development on NZY plates containing X-gal 
(5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) and 
IPTG (isopropyl-1-thio-β-D-galactopyranoside). The 
nonrecombinant backgrounds for the flower, one day 
silique, three day silique, leaf, root, and developing 
seed cDNA libraries were 2.8%, 2%, 3.3%, 6.5%, 2.5%, 
and 1.9% respectively. 

Random priming DNA labeling 

The cDNA inserts of isolated clones 
(unhybridized cDNAs) were excised by EcoRI/XhoI double 
digestion and gel-purified for random priming 
labeling. Klenow reaction mixture contained 50 ng DNA 
templates, 10 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 7.5 mM 
DTT, 50 µM each of dCTP, dGTP, and dTTP, 10 µM hexamer 
random primers (Boehringer Mannheim), 50 µCi 
α-32P-dATP, 3000 Ci/mmole, 10 mCi/ml (DuPont), and 5 units 
of DNA polymerase I Klenow fragment (New England 
[...] precipitated or purified by passing through Sephadex 
G-50 spin columns (Boehringer Mannheim). 

Clone blot virtual subtraction 

Mass excision of λ-ZAP cDNA libraries was 
carried out by co-infecting XL1-Blue MRF' host cells 
with recombinant phage from the libraries and ExAssist 
helper phage (STRATAGENE). Excised phagemids were 
rescued by SOLR cells. Plasmid DNAs were prepared by 
the boiling mini-prep method (Holmes et al. 1981 Anal. 
Biochem. 114:193-197) from randomly isolated clones. 
cDNA inserts were excised by EcoRI and XhoI double 
digestion, and resolved on 1% agarose gels. The DNAs 
were denatured in 0.5 N NaOH and 1.5 M NaCl for 45 
minutes, neutralized in 0.5 M Tris-HCl, pH 8.0, and 
1.5 M NaCl for 45 minutes, and then transferred by 
blotting to nylon membranes (Micron Separations, Inc.) 
in 10X SSC overnight. After one hour prehybridization 
at 65°C, root RP-cDNA probe was added to the same 
hybridization buffer containing 1% bovine albumin 
fraction V (Sigma), 1 mM EDTA, 0.5 M NaHPO4, pH 7.2, 
7% SDS. The hybridization continued for 24 hours at 
65°C. The filters were washed in 0.5% bovine albumin, 
1 mM EDTA, 40 mM NaHPO4, pH 7.2, 5% SDS for ten 
minutes at room temperature, and 3 x 10 minutes in 1 
mM EDTA, 40 mM NaHPO4, pH 7.2, 1% SDS at 55°C. 
Autoradiographs were exposed to X-ray films (Kodak) 
for two to five days at -80°C. 

Hybridization of resulting blots with root 
RP-PCR probes "virtually subtracted" seed cDNAs shared 
with the root mRNA population. The remaining seed 
cDNAs representing putative seed-specific cDNAs, 
including those encoding oleosins, were sequenced by 
the cycle sequencing method, thereby identifying AtS21 
as an oleosin cDNA clone. 



Sequence analysis of AtS21 

The oleosin cDNA is 834 bp long including an 
18 bp long poly A tail (Fig. 4, SEQ ID NO:2). It has 
high homology to other oleosin genes from Arabidopsis 
as well as from other species. Recently, an identical 
oleosin gene has been reported (Zou et al., 1996, 
Plant Mol. Biol. 31:429-433). The predicted protein is 
191 amino acids long with a highly hydrophobic middle 
domain flanked by a hydrophilic domain on each side. 
The existence of two in-frame stop codons just 
upstream of the first ATG and the similarity to other 
oleosin genes indicate that this cDNA is full-length 
(Figure 4, SEQ ID NO:2). The predicted protein has 
three distinctive 
domains based on the distribution of its amino acid 
residues. Both the N-terminal and C-terminal domains 
are rich in charged residues while the central domain 
is absolutely hydrophobic (Figure 5). As many as 20 
leucine residues are located in the central domain and 
arranged as repeats with one leucine occurring every 
7-10 residues. Other non-polar amino acid residues 
are also clustered in the central domain making this 
domain absolutely hydrophobic (Figure 6). 
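
The three-domain organization described above is the kind of pattern a sliding-window hydropathy profile makes visible. The sketch below uses the Kyte-Doolittle scale, which is an assumption: the text does not say how the hydrophobicity analysis was done. The 22-residue toy sequence is hypothetical and is not the AtS21 protein.

```python
# Kyte-Doolittle hydropathy values (assumed scale, not named in the source).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Return the mean hydropathy of every window along the protein."""
    if window <= 0 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy protein: charged ends flanking a non-polar core.
toy_protein = "KDEKRD" + "LLVVIILLAA" + "KDERKD"
profile = hydropathy_profile(toy_protein, window=5)
```

Windows over the charged ends score negative and windows over the core score strongly positive, mirroring the hydrophilic-hydrophobic-hydrophilic arrangement reported for the predicted protein.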

Extensive searches of different databases 
using both AtS21 cDNA and its predicted protein 
sequence identified oleosins from carrot, maize, 
cotton, rapeseed, Arabidopsis, and other plant 
species. The homology is mainly restricted to the 
central hydrophobic domain. Seven Arabidopsis oleosin 
sequences were found. AtS21 represents the same gene 
as Z54164 which has a few more bases in the 5' 
untranslated region. The seven Arabidopsis oleosin 
sequences available so far were aligned to each other 
(Figure 7). The result suggested that the seven 
sequences fall into three groups. The first group 
includes AtS21 (SEQ ID NO:5), X91918 (SEQ ID NO:6), 
and the partial sequence Z29859 (SEQ ID NO:7). Since 
X91918 (SEQ ID NO:6) has only its last residue 
different from AtS21 (SEQ ID NO:5), and since Z29859 
(SEQ ID NO:7) has only three amino acid residues which 
are different from AtS21 (SEQ ID NO:5), all three 
sequences likely represent the same gene. The two 
sequences of the second group, X62352 (SEQ ID NO:8) 
and Atol3 (SEQ ID NO:9), are different in both 
sequence and length. Thus, there is no doubt that 
they represent two independent genes. Like the first 
group, the two sequences of the third group, X91956 
(SEQ ID NO:10) and L40954 (SEQ ID NO:11), also have 
only three divergent residues which may be due to 
sequence errors. Thus, X91956 (SEQ ID NO:10) and 
L40954 (SEQ ID NO:11) likely represent the same gene. 
Unlike all the other oleosin sequences which were 
predicted from cDNA sequences, X62352 (SEQ ID NO:8) 
was deduced from a genomic sequence (Van Rooijen et 
al. 1992 Plant Mol. Biol. 15:1177-1179). In 
conclusion, four different Arabidopsis oleosin genes 
have been identified so far, and they are conserved 
only in the middle of the hydrophobic domain. 

Northern Analysis 

In order to characterize the expression 
pattern of the native AtS21 gene, Northern analysis 
was performed as described in Example 4 except that 
the probe was the AtS21 cDNA (pAN1 insert) labeled 
with 32P-dATP to a specific activity of 5 x 10^8 cpm/µg. 

Results indicated that the AtS21 gene is 
strongly expressed in developing seeds and weakly 
expressed in silique coats (Figure 8A). A much larger 
transcript, which might represent unprocessed AtS21 
pre-mRNA, was also detected in developing seed RNA. 
AtS21 was not detected in flower, leaf, root (Figure 
8A), or one day silique RNAs. A different Northern 
analysis revealed that AtS21 is also strongly 
expressed in imbibed germinating seeds (Figs. 13A and 
13B). 

EXAMPLE 6 

Characterization of Oleosin 
Genomic Clones and Isolation of Oleosin Promoter 

Genomic clones were isolated by screening an 
Arabidopsis genomic DNA library using the full length 
cDNA (AtS21) as a probe. Two genomic clones were 
mapped by restriction enzyme digestion followed by 
Southern hybridization using the 5' half of the cDNA 
cleaved by SacI as a probe. A 2 kb SacI fragment was 
subcloned and sequenced (Fig. 9, SEQ ID NO:35). Two 
regions of the genomic clone are identical to the cDNA 
sequence. A 395 bp intron separates the two regions. 

The copy number of the AtS21 gene in the
Arabidopsis genome was determined by genomic DNA
Southern hybridization following digestion with the
enzymes BamHI, EcoRI, HindIII, SacI and XbaI, using
the full length cDNA as a probe (Figure 8B). A single
band was detected in all the lanes except the SacI
digestion, where two bands were detected. Since the
cDNA probe has an internal SacI site, these results
indicated that AtS21 is a single copy gene in the
Arabidopsis genome. Since the Arabidopsis genome is
known to contain different isoforms of
oleosin genes, this Southern analysis also
demonstrates that the different oleosin isoforms of
Arabidopsis are divergent at the DNA sequence level.
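
The band-count reasoning above can be illustrated with a minimal sketch. It scans a short excerpt of the AtS21 cDNA (copied from the SEQ ID NO:3 listing of this document, which is OCR-derived and may contain errors) for the recognition sequences of the five enzymes. The helper names `find_sites` and `expected_bands` are hypothetical, and the rule "bands = internal cuts + 1" assumes a single-copy locus, complete digestion, and no additional sites in intron sequence covered by the probe.

```python
# Recognition sequences of the five enzymes named in the text. All are
# palindromic, so scanning one strand is sufficient.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "SacI": "GAGCTC",
    "XbaI": "TCTAGA",
}


def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of site in seq."""
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


def expected_bands(probe_region, site):
    """Bands predicted for a single-copy locus: internal cuts + 1 (assumption)."""
    return len(find_sites(probe_region, site)) + 1


# Two consecutive lines of the AtS21 coding sequence as printed in the
# SEQ ID NO:3 listing (excerpt only, not the full-length probe).
CDNA_EXCERPT = (
    "GTGACGGGAATCTTGGCTTCTGGTTTGTTTGGGTTGACGGGTCTGAGC"
    "TCGGTCTCGTGGGTCCTCAACTACCTCCGTGGGACGAGTGATACAGTG"
)

for enzyme, site in SITES.items():
    print(enzyme, find_sites(CDNA_EXCERPT, site), expected_bands(CDNA_EXCERPT, site))
```

In this excerpt only the SacI sequence is found, which agrees with the two bands reported for the SacI lane and the single band reported for the other enzymes.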

Two regions, separated by a 395 bp intron,
of the genomic DNA fragment are identical to the AtS21
cDNA sequence. Database searches using the 5'
promoter sequence upstream of the AtS21 cDNA sequence did
not identify any sequence with significant homology.
Furthermore, the comparison of the AtS21 promoter sequence
with another Arabidopsis oleosin promoter isolated
previously (Van Rooijen, et al., 1992) revealed
little similarity. The AtS21 promoter sequence is
rich in A/T bases, and contains as many as 44 direct
repeats ranging from 10 bp to 14 bp with only one
mismatch allowed. Two 14 bp direct repeats and a
putative ABA response element are underlined in Figure
9.
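
One way such a repeat survey could be carried out is sketched below. This is an illustration only, not the procedure used for the AtS21 promoter: the toy sequence is invented, and the function names `at_fraction`, `hamming` and `direct_repeats` are hypothetical. The sketch reports the A/T fraction of a sequence and lists pairs of non-overlapping windows of a chosen length that differ at no more than one position.

```python
def at_fraction(seq):
    """Fraction of A and T bases in seq."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def direct_repeats(seq, k, max_mismatch=1):
    """Return (i, j) start pairs of non-overlapping k-mers (j >= i + k)
    that differ at no more than max_mismatch positions."""
    seq = seq.upper()
    hits = []
    last_start = len(seq) - k
    for i in range(last_start + 1):
        for j in range(i + k, last_start + 1):
            if hamming(seq[i:i + k], seq[j:j + k]) <= max_mismatch:
                hits.append((i, j))
    return hits


# Invented toy sequence: two 10 bp units that differ at one position,
# separated by a 4 bp spacer. It is NOT taken from the AtS21 promoter.
TOY = "ATTTGACATA" + "GGCC" + "ATTTGACTTA"

print(round(at_fraction(TOY), 3))
print(direct_repeats(TOY, 10, max_mismatch=1))
print(direct_repeats(TOY, 10, max_mismatch=0))
```

The scan is quadratic in sequence length, which is acceptable for a promoter fragment of roughly 1.3 kb.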

EXAMPLE 7

Construction of AtS21 
Promoter/GUS Gene Expression Cassette and Expression 
Patterns in Transgenic Arabidopsis and Tobacco 



Construction of AtS21 promoter/GUS gene expression
cassette

The 1267 bp promoter fragment starting from
the first G upstream of the ATG codon of the genomic
DNA fragment was amplified using PCR and fused to the
GUS reporter gene for analysis of its activity.
The promoter fragment of the AtS21 genomic clone was
amplified by PCR using the T7 primer
GTAATAGGACTCAGTATAGGGC (SEQ ID NO:13) and the 21P
primer GGGGATCGTATACTAAAACTATAGAGTAAAGG (SEQ ID NO:14)
complementary to the 5' untranslated region upstream
of the first ATG codon (Figure 9). A BamHI cloning
site was introduced by the 21P primer. The amplified
fragment was cloned into the BamHI and SacI sites of
pBluescript KS (Stratagene). Individual clones were
sequenced to check for possible PCR mutations as well as
the orientation of their inserts. The correct clone
was digested with BamHI and HindIII, and the excised
promoter fragment (1.3 kb) was cloned into the
corresponding sites of pBI101.1 (Jefferson, R.A.
1987a, Plant Mol. Biol. Rep. 5:387-405; Jefferson et
al., 1987b, EMBO J. 6:3901-3907) upstream of the GUS
gene. The resultant plasmid was designated pAN5 (Fig.
10). The AtS21 promoter/GUS construct (pAN5) was
introduced into both tobacco (by the leaf disc method,
Horsch et al., 1985; Bogue et al., 1990, Mol. Gen. Genet.
221:49-57) and Arabidopsis Columbia ecotype (via vacuum
infiltration as described by Bechtold, et al., 1993,
C.R. Acad. Sci. Paris, 315:1194-1199). Seeds were
sterilized and selected on media containing 50 µg/ml
kanamycin and 500 µg/ml carbenicillin.
GUS activity assay: Expression patterns of the
reporter GUS gene were revealed by histochemical
staining (Jefferson, et al., 1987a, Plant Mol. Biol.
Rep. 5:387-405). Different tissues were stained in
substrate solution containing 2 mg/ml
5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid (X-Gluc)
(Research Organics, Inc.), 0.5 mM potassium
ferrocyanide, and 0.5 mM potassium ferricyanide in 50
mM sodium phosphate buffer, pH 7.0 at 37°C overnight,
and then dehydrated successively in 20%, 40% and 80%
ethanol (Jefferson, et al., 1987). Photographs were
taken using an Axiophot (Zeiss) compound microscope or
Olympus SZH10 dissecting microscope. Slides were
converted to digital images using a SprintScan 35LE
slide scanner (Polaroid) and compiled using Adobe
Photoshop™ 3.0.5 and Canvas™ 3.5.

GUS activities were quantitatively measured
by fluorometry using 2 mM 4-MUG
(4-methylumbelliferyl-β-D-glucuronide) as substrate (Jefferson, et al.,
1987). Developing Arabidopsis seeds were staged
according to their colors, and other plant tissues
were collected and kept at -80°C until use. Plant
tissues were ground in extraction buffer containing 50
mM sodium phosphate, pH 7.0, 10 mM EDTA, 10 mM
β-mercaptoethanol, 0.1% Triton X-100, and 0.1% sodium
lauryl sarcosine. The tissue debris was removed by 5
minutes centrifugation in a microfuge. The
supernatant was aliquoted, mixed with substrate, and
incubated at 37°C for 1 hour. Three replicates were
assayed for each sample. The reactions were stopped
by adding 4 volumes of 0.2 M sodium carbonate.
Fluorescence was read using a TKO-100 DNA fluorometer
(Hoefer Scientific Instruments). Protein
concentrations of the extracts were determined by the
Bradford method (Bio-Rad).
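
The conversion from such readings to a specific activity can be sketched as follows. This is a minimal illustration under stated assumptions, not the calculation actually used: the function names are hypothetical, the readings are invented, and the amount of 4-MU released is assumed to have already been obtained from a 4-MU standard curve. Activity is expressed as pmol 4-MU per minute per µg protein, and the three replicates are summarized by their mean and sample standard deviation.

```python
from statistics import mean, stdev


def specific_activity(pmol_4mu, minutes, ug_protein):
    """GUS specific activity in pmol 4-MU per minute per microgram protein."""
    if minutes <= 0 or ug_protein <= 0:
        raise ValueError("minutes and ug_protein must be positive")
    return pmol_4mu / (minutes * ug_protein)


def summarize(replicates, minutes=60.0):
    """Mean and sample SD over (pmol_4mu, ug_protein) replicate pairs.

    The 60 minute default matches the 1 hour incubation described in the text.
    """
    values = [specific_activity(p, minutes, ug) for p, ug in replicates]
    return mean(values), stdev(values)


# Invented readings for illustration only (pmol 4-MU released, ug protein).
seed_replicates = [(3030.0, 2.0), (2900.0, 2.0), (3160.0, 2.0)]
leaf_replicates = [(0.0, 2.0), (0.0, 2.0), (36.0, 2.0)]

seed_mean, seed_sd = summarize(seed_replicates)
leaf_mean, leaf_sd = summarize(leaf_replicates)
print(round(seed_mean, 2), round(seed_sd, 2))
print(round(leaf_mean, 2), round(leaf_sd, 2))
```

With the invented leaf readings, where signal appears in only one of three measurements, the standard deviation exceeds the mean. This is the kind of pattern that arises when activity is present in only some samples.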

Expression patterns of AtS21 promoter/GUS in
transgenic Arabidopsis and tobacco

In Arabidopsis, GUS activity was detected in
green seeds, and node regions where siliques, cauline
leaves and branches join the inflorescence stem
(Figures 11A and 11B). No GUS activity was detected
in any leaf, root, flower, silique coat, or the
internode regions of the inflorescence stem. Detailed
studies of the GUS expression in developing seeds
revealed that the AtS21 promoter was only active in
green seeds in which the embryos had already developed
beyond heart stage (Figures 11C and 11G). The
youngest embryos showing GUS activity that could be
detected by histochemical staining were at early
torpedo stage. Interestingly, the staining was
restricted to the lower part of the embryo including
hypocotyl and embryonic radicle. No staining was
detected in the young cotyledons (Figures 11D and
11E). Cotyledons began to be stained when the embryos
were at late torpedo or even early cotyledon stage
(Figures 11F and 11H). Later, the entire embryos were
stained, and the staining became more intense as the
embryos matured (Figures 11I and 11J). It was also
observed that GUS gene expression was restricted to
the embryos. Seed coat and young endosperm were not
stained (Figure 11C).

GUS activity was also detected in developing
seedlings. Young seedlings 3-5 days old were
stained everywhere. Although some root hairs close to
the hypocotyl were stained (Figure 11K), most of the
newly formed structures such as root hairs, lateral
root primordia and shoot apex were not stained
(Figures 11L and 11N). Later, the staining was
restricted to cotyledons and hypocotyls when lateral
roots grew from the elongating embryonic root. The
staining on embryonic roots disappeared. No staining
was observed on newly formed lateral roots, true
leaves, or trichomes on true leaves (Figures 11M and
11N).

AtS21 promoter/GUS expression patterns in
tobacco are basically the same as in Arabidopsis. GUS
activity was only detected in late stage seeds and
different node regions of mature plants. In
germinating seeds, strong staining was detected
throughout the entire embryos as soon as one hour
after they were dissected from imbibed seeds. Mature
endosperm, which Arabidopsis seeds do not have, was
also stained, but the seed coat was not (Figure 12A). The root
tips of some young seedlings of one transgenic line
were not stained (Figure 12B). Otherwise, GUS
expression patterns in developing tobacco seedlings
were the same as in Arabidopsis seedlings (Figures
12B, 12C, and 12D). Newly formed structures such as
lateral roots and true leaves were not stained.



AtS21 mRNA levels in developing seedlings

Since the observed strong activities of
AtS21 promoter/GUS in both Arabidopsis and tobacco
seedlings are not consistent with the seed-specific
expression of oleosin genes, Northern analysis was
carried out to determine if AtS21 mRNA was present in
developing seedlings where the GUS activity was so
strong. RNAs prepared from seedlings at different
stages from 24 hours to 12 days were analyzed by
Northern hybridization using AtS21 cDNA as the probe.
Surprisingly, AtS21 mRNA was detected in 24-48 hour
imbibed seeds at a high level comparable to that in
developing seeds. The mRNA level dropped dramatically
when young seedlings first emerged at 74 hours
(Figures 13A and 13B). In 96 hour and older
seedlings, no signal was detected even with a longer
exposure (Figure 13B). The loadings of RNA samples
were checked by hybridizing the same blot with a
tubulin gene probe (Figure 13C) which was isolated and
identified by EST analysis as described in Example 2.
Since AtS21 mRNA was so abundant in seeds, residual
AtS21 probes remained on the blot even after extensive
stripping. These results indicated that the AtS21 mRNA
detected in imbibed seeds and very young seedlings is
the carry-over of AtS21 mRNA from dry seeds. It has
recently been reported that an oleosin Atol2 mRNA
(identical to AtS21) is most abundant in dry seeds
(Kirik, et al., 1996, Plant Mol. Biol. 31(2):413-417).
Similarly, the strong GUS activities in seedlings were
most likely due to both the carry-over of
β-glucuronidase protein and the de novo synthesis of
β-glucuronidase from its mRNA carried over from the dry
seed stage.

EXAMPLE 8

Activity comparison between the
AtS21 promoter and the 35S promoter

The GUS activities in transgenic Arabidopsis
developing seeds expressed by the AtS21 promoter were
compared with those expressed by the 35S promoter in
the construct pBI221 (Jefferson et al., EMBO J.
6:3901-3907). The seeds were staged according to their
colors (Table 2). The earliest stage was from
globular to late heart stage when the seeds were still
white but large enough to be dissected from the
siliques. AtS21 promoter activity was detected at a
level about three times lower than that of the 35S
promoter at this stage. 35S promoter activity
remained at the same low level throughout the entire
embryo development. In contrast, AtS21 promoter
activity increased quickly as the embryos passed
torpedo stage and reached the highest level of 25.25
pmole 4-MU/min/µg protein at mature stage (Figure 5-8).
The peak activity of the AtS21 promoter is as
much as 210 times higher than its lowest activity at
globular to heart stage, and is close to 100 times
higher than the 35S promoter activity at the same
stage (Table 2). The activity levels of the AtS21
promoter are similar to those of another Arabidopsis
oleosin promoter expressed in Brassica napus (Plant et
al., 1994, Plant Mol. Biol. 25:193-205). AtS21 promoter
activity was also detected at background level in
leaf. The high standard deviation, higher than the
average itself, indicated that the GUS activity was
only detected in the leaves of some lines (Table 2).
On the other hand, 35S promoter activity in leaf was
more than 20 times higher than that in seed. The side
by side comparison of activities between the AtS21
promoter and the 35S promoter is shown in Figure 14.
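
The fold differences quoted above imply approximate absolute activities, which can be back-calculated from the stated peak value. The short sketch below does only this arithmetic. The helper name `implied_level` is hypothetical, and because the text gives the ratios as approximations ("as much as 210", "close to 100"), the results are rough estimates and not values taken from Table 2.

```python
# Peak AtS21 promoter activity at mature stage, as stated in the text
# (pmol 4-MU per minute per microgram protein).
PEAK_ATS21 = 25.25


def implied_level(peak, fold):
    """Activity implied by the statement 'peak is `fold` times higher'."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return peak / fold


# Approximate fold differences quoted in the text.
lowest_ats21 = implied_level(PEAK_ATS21, 210)      # globular to heart stage
s35_at_maturity = implied_level(PEAK_ATS21, 100)   # 35S promoter, same stage

print(round(lowest_ats21, 3))
print(round(s35_at_maturity, 4))
```

Both estimates fall well below 1 pmol 4-MU/min/µg protein, in keeping with the description of 35S promoter activity in seeds as consistently low.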

Although the AtS21 promoter activity was
about 3 times lower in dry seed of tobacco than in
Arabidopsis dry seed, the absolute GUS activity was
still higher than that expressed by the 35S promoter
in Arabidopsis leaf (Table 2). No detectable AtS21
promoter activity was observed in tobacco leaf (Figure
14).

Comparison of the AtS21 promoter versus the
35S promoter revealed that the latter is not a good
promoter to express genes at high levels in developing
seeds. Because of its consistent low activities
throughout the entire embryo development period, the 35S
promoter is useful for consistent low level expression
of target genes. On the other hand, the AtS21
promoter is a very strong promoter that can be used to
express genes starting from heart stage embryos and
accumulating until the dry seed stage. The 35S
promoter, although not efficient, is better than the
AtS21 promoter in expressing genes in embryos prior to
heart stage.

EXAMPLE 9

In order to create a construct with the AtS21 promoter driving
expression of the borage Δ6-desaturase gene, the [illegible] fragment
from pAN5 was removed by [illegible]. The [illegible] insert of
[illegible] was excised by first digesting with [illegible] and
[illegible] the residual overhang [illegible] with SmaI. The
resulting [illegible] was used to replace the excised portion
[illegible], yielding pAN3.

After transformation of [illegible] Arabidopsis following the
[illegible], Δ6-desaturase activity was monitored by assaying
the corresponding fatty acid [illegible] of its
reaction products, γ-linolenic acid (GLA) and
octadecatetraenoic acid (OTA), [illegible]
referred to in Example 3. The GLA and OTA levels
(Table 3) of the transgenic [illegible] ranged up to [illegible] of
[illegible] fatty acids [illegible],
respectively. No GLA or OTA was detected in the
leaves of these plants. [illegible] CaMV 35S
promoter/Δ6-desaturase [illegible]
levels in seeds ranging up to [illegible]
1.3%, and no measurable OTA in seeds.

EXAMPLE 10

[Title partly illegible: ... Expression ... Region ...]

[illegible] was transformed
with the strain of Agrobacterium tumefaciens EHA105
containing the plasmid pAN3 (i.e. the borage
Δ6-desaturase gene under the control of the Arabidopsis
oleosin promoter, Example 9).

Terminal internodes of Westar were
co-cultivated for 2-3 days with induced Agrobacterium
tumefaciens strain EHA105 (Alt-Moerbe et al., 1988, Mol.
Gen. Genet. 213:1-8; Barnes et al., 1993, Plant Cell
Reports 12:559-563), then transferred onto
[illegible] ([illegible] 5:321-325). The regenerated
shoots were transferred to
[illegible] ([illegible] Mol. Gen. Genet.
191:244-250), and a polymerase chain reaction (PCR)
test was performed on leaf fragments to assess the
presence of the gene.

[illegible] the leaves according
to the protocol of K.M. Haymes et al. (1996) Plant
Molecular Biology Reporter 14(3):280-284, and
resuspended in 100 µl of water, without RNase
treatment. 5 µl of extract were used for the PCR
[illegible]. The reaction was
performed in a Perkin-Elmer 9600 thermocycler, with
the following cycles:

1 cycle: 95°C, 5 minutes

30 cycles: 95°C, 45 sec; 52°C, 45 sec; 72°C, 1 minute

1 cycle: 72°C, 5 minutes

and the following primers (derived from near the metal
box regions, as indicated in Fig. 1, SEQ ID NO:1):
5' TGG AAA TGG AAC CAT AA 3'
5' GGA AAC AAA TGA TGC TC 3'

Amplification of the DNA revealed the expected 549
base pair PCR fragment (Figure 17).
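
The primer pair and cycling conditions above can be summarized with a short sketch. It is illustrative only: the function names are hypothetical, the melting temperatures use the simple Wallace rule (2°C per A/T plus 4°C per G/C), which is a rough estimate for short oligonucleotides and not a method stated in this document, and the run time counts programmed hold times only, ignoring ramping between temperatures.

```python
# Primers as given in the text, written 5' to 3'.
FORWARD = "TGGAAATGGAACCATAA"
REVERSE = "GGAAACAAATGATGCTC"

# Cycling program from the text: (repeats, [(temperature in C, seconds), ...]).
PROGRAM = [
    (1, [(95, 300)]),
    (30, [(95, 45), (52, 45), (72, 60)]),
    (1, [(72, 300)]),
]


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer):
    """Wallace-rule melting temperature estimate in degrees C."""
    at = sum(base in "AT" for base in primer)
    gc = sum(base in "GC" for base in primer)
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written in upper-case ACGT."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def total_hold_seconds(program):
    """Sum of programmed hold times, ignoring temperature ramping."""
    return sum(repeats * sum(sec for _, sec in steps) for repeats, steps in program)


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), round(gc_fraction(primer), 2), wallace_tm(primer))
print(reverse_complement(REVERSE))
print(total_hold_seconds(PROGRAM) / 60, "minutes of programmed holds")
```

The reverse complement of the second primer gives the top-strand sequence to look for in SEQ ID NO:1 when locating the downstream end of the amplified region.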

The positive shoots were transferred to
elongation medium, then to rooting medium (DeBlock et
al., 1989, Plant Physiol. 91:694-701). Shoots with a
well-developed root system were transferred to the
greenhouse. When plants were well developed, leaves
were collected for Southern analysis in order to
assess gene copy number.

Genomic DNA was extracted according to the
procedure of Bouchez et al. (1996) Plant Molecular
Biology Reporter 14:115-123, digested with the
restriction enzymes Bgl I and/or Cla I,
electrophoretically separated on agarose gel (Maniatis
et al., 1982, in Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY), and prepared for transfer to nylon
membranes (Nytran membrane, Schleicher & Schuell)
according to the instructions of the manufacturer.
DNA was then transferred to membranes overnight by
capillary action using 20X SSC (Maniatis et al., 1982).
Following transfer, [illegible] probe [illegible]
to a specific [illegible] cpm/µg [illegible]
obtained in [illegible] hybridization [illegible]
washed [illegible]. Exposure time was generally 3
days.

The results obtained confirmed [illegible]
the gene [illegible] digested [illegible]
represents [illegible] generates a
[illegible].

The term "comprises" or "comprising" is
defined as specifying the presence of the stated
features, integers, steps, or components as
referred to in the claims, but does not preclude
the presence of one or more other features,
integers, steps, components, or groups thereof.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Rhone Poulenc Agro 

Thomas, Terry L. 
Li, Zhongsen 

(ii) TITLE OF INVENTION: AN OLEOSIN 5' REGULATORY REGION FOR THE

MODIFICATION OF PLANT SEED LIPID COMPOSITION 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Scully, Scott, Murphy & Presser 

(B) STREET: 400 Garden City Plaza 

(C) CITY: Garden City 

(D) STATE: New York 

(E) COUNTRY: USA

(F) ZIP: 11530 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC - DOS/MS - DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/831,575 

(B) FILING DATE: 9 April 1997

(C) CLASSIFICATION: 

(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: DiGiglio, Frank S. 

(B) REGISTRATION NUMBER: 31,346

(C) REFERENCE/DOCKET NUMBER: 10203 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (516) 742-4343 

(B) TELEFAX: (516) 742-4366 

(2) INFORMATION FOR SEQ ID NO:l: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 43..1387

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATATCTGCCT ACCCTCCCAA AGAGAGTAGT CATTTTTCAT CA ATG GCT GCT CAA 54 

Met Ala Ala Gin 
1 

ATC AAG AAA TAC ATT ACC TCA GAT GAA CTC AAG AAC CAC GAT AAA CCC 102 
He Lys Lys Tyr He Thr Ser Asp Glu Leu Lys Asn His Asp Lys Pro 
5 10 15 20 

GGA GAT CTA TGG ATC TCG ATT CAA GGG AAA GCC TAT GAT GTT TCG GAT 150 
Gly Asp Leu Trp He Ser He Gin Gly Lys Ala Tyr Asp Val Ser Asn 

25 30 35 

TGG GTG AAA GAC CAT CCA GGT GGC AGC TTT CCC TTG AAG AGT CTT GCT 198 
Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu Lys Ser Leu Ala 
40 45 50 

GGT CAA GAG GTA ACT GAT GCA TTT GTT GCA TTC CAT CCT GCC TCT ACA 246 
Gly Gin Glu Val Thr Asp Ala Phe Val Ala Phe His Pro Ala Ser Thr 
55 60 65 

TGG AAG AAT CTT GAT AAG TTT TTC ACT GGG TAT TAT CTT AAA GAT TAC 294 
Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr Leu Lys Asp Tyr 
70 75 80 

TCT GTT TCT GAG GTT TCT AAA GAT TAT AGG AAG CTT GTG TTT GAG TTT 3 42 

Ser Val Ser Glu Val Ser Lys Asp Tyr Arg Lys Leu Val Phe Glu Phe 
85 90 95 100 

TCT AAA ATG GGT TTG TAT GAC AAA AAA GGT CAT ATT ATG TTT GCA ACT 3 90 

Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His He Met Phe Ala Thr 

105 110 115 

TTG TGC TTT ATA GCA ATG CTG TTT GCT ATG AGT GTT TAT GGG GTT TTG 438 
Leu Cys Phe He Ala Met Leu Phe Ala Met Ser Val Tyr Gly Val Leu 
120 125 . 130 

TTT TGT GAG GGT GTT TTG GTA CAT TTG TTT TCT GGG TGT TTG ATG GGG 4 86 

Phe Cys Glu Gly Val Leu Val His Leu Phe Ser Gly Cys Leu Met Glv 
135 140 145 

TTT CTT TGG ATT CAG AGT GGT TGG ATT GGA CAT GAT GCT GGG CAT TAT 53 4 

Phe Leu Trp He Gin Ser Gly Trp He Gly His Asp Ala Gly His Tyr 
150 155 160 

ATG GTA GTG TCT GAT TCA AGG CTT AAT AAG TTT ATG GGT ATT TTT GCT 582 
Met Val Val Ser Asp Ser Arg Leu Asn Lys Phe Met Gly He Phe Ala 
165 170 175 180 

GCA AAT TGT CTT TCA GGA ATA AGT ATT GGT TGG TGG AAA TGG AAC CAT 63 0 

Ala Asn Cys Leu Ser Gly He Ser He Gly Trp Trp Lys Trp Asn His 

185 190 195 
AAT GCA CAT CAC ATT GCC TGT AAT AGC CTT GAA TAT GAC CCT GAT TTA

Asn Ala His His Ile Ala Cys Asn Ser Leu Glu Tyr Asp Pro Asp Leu
200 205 210 

CAA TAT ATA CCA TTC CTT GTT GTG TCT TCC AAG TTT TTT GGT TCA CTC 
Gin Tyr lie Pro Phe Leu Val Val Ser Ser Lys Phe Phe Gly Ser Leu 
215 220 225 

ACC TCT CAT TTC TAT GAG AAA AGG TTG ACT TTT GAC TCT TTA TCA AGA 
Thr Ser His Phe Tyr Glu Lys Arg Leu Thr Phe Asp Ser Leu Ser Arg 
230 235 240 

TTC TTT GTA AGT TAT CAA CAT TGG ACA TTT TAC CCT ATT ATG TGT GCT 
Phe Phe Val Ser Tyr Gin His Trp Thr Phe Tyr Pro lie Met Cys Ala 
245 250 255 260 

GCT AGG CTC AAT ATG TAT GTA CAA TCT CTC ATA ATG TTG TTG ACC AAG 
Ala Arg Leu Asn Met Tyr Val Gin Ser Leu lie Met Leu Leu Thr Lys 

265 270 275 

AGA AAT GTG TCC TAT CGA GCT ' CAG GAA CTC TTG GGA TGC CTA GTG TTC 
Arg Asn Val Ser Tyr Arg Ala Gin Glu Leu Leu Gly Cys Leu Val Phe 
280 285 290 

TCG ATT TGG TAC CCG TTG CTT GTT TCT TGT TTG CCT AAT TGG GGT GAA 
Ser lie Trp Tyr Pro Leu Leu Val Ser Cys Leu Pro Asn Trp Gly Glu 
295 300 305 

AGA ATT ATG TTT GTT ATT GCA AGT TTA TCA GTG ACT GGA ATG CAA CAA 
Arg lie Met t>he Val lie Ala Ser Leu Ser Val Thr Gly Met Gin Gin 
310 315 320 

GTT CAG TTC TCC TTG AAC CAC TTC TCT TCA AGT GTT TAT GTT GGA AAG 
Val Gin Phe Ser Leu Asn His Phe Ser Ser Ser Val Tyr Val Gly Lys 
325 330 335 340 

CCT AAA GGG AAT AAT TGG TTT GAG AAA CAA ACG GAT GGG ACA CTT GAC 
Pro Lys Gly Asn Asn Trp Phe Glu Lys Gin Thr Asp Gly Thr Leu Asp 

345 350 355 

ATT TCT TGT CCT CCT TGG ATG GAT TGG TTT CAT GGT GGA TTG CAA TTC 
lie Ser Cys Pro Pro Trp Met Asp Trp Phe His Gly Gly Leu Gin Phe 
360 365 370 

CAA ATT GAG CAT CAT TTG TTT CCC AAG ATG CCT AGA TGC AAC CTT AGG 
Gin lie Glu His His Leu Phe Pro Lys Met Pro Arg Cys Asn Leu Arg 
375 380 385 

AAA ATC TCG CCC TAC GTG ATC GAG TTA TGC AAG AAA CAT AAT TTG CCT 
Lys lie Ser Pro Tyr Val lie Glu Leu Cys Lys Lys His Asn Leu Pro 
390 395 400 

TAC AAT TAT GCA TCT TTC TCC AAG GCC AAT GAA ATG ACA CTC AGA ACA 

Tyr Asn Tyr Ala Ser Phe Ser Lys Ala Asn Glu Met Thr Leu Arg Thr 
405 410 415 420 
TTG AGG AAC ACA GCA TTG CAG GCT AGG GAT ATA ACC AAG CCG CTC CCG 1350

Leu Arg Asn Thr Ala Leu Gln Ala Arg Asp Ile Thr Lys Pro Leu Pro

425 430 435 

AAG AAT TTG GTA TGG GAA GCT CTT CAC ACT CAT GGT T AAAATTACCC 1397 
Lys Asn Leu Val Trp Glu Ala Leu His Thr His Gly 
440 445 

TTAGTTCATG TAATAATTTG AGATTATGTA TCTCCTATGT TTGTGTCTTG TCTTGGTTCT 1457 

ACTTGTTGGA GTCATTGCAA CTTGTCTTTT ATGGTTTATT AGATGTTTTT TAATATATTT 1517 

TAGAGGTTTT GCTTTCATCT CCATTATTGA TGAATAAGGA GTTGCATATT GTCAATTGTT 1577 

GTGCTCAATA TCTGATATTT TGGAATGTAC TTTGTACCAC GTGGTTTTCA GTTGAAGCTC 1637 

ATGTGTACTT CTATAGACTT TGTTTAAATG GTTATGTCAT GTTATTT 1684 

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Ala Ala Gin He Lys Lys Tyr He Thr Ser Asp Glu Leu Lys Asn 
15 10 15 

His Asp Lys Pro Gly Asp Leu Trp He Ser He Gin Gly Lys Ala Tyr 
20 25 30 

Asp Val Ser Asp Trp Val Lys Asp His Pro Gly Gly Ser Phe Pro Leu 

35 40 45 

Lys Ser Leu Ala Gly Gin Glu Val Thr Asp Ala Phe Val Ala Phe His 
50 55 60 

Pro Ala Ser Thr Trp Lys Asn Leu Asp Lys Phe Phe Thr Gly Tyr Tyr 
65 70 75 80 

Leu Lys Asp Tyr Ser Val Ser Glu Val Ser Lys Asp Tyr Arg Lys Leu 

85 90 95 

Val Phe Glu Phe Ser Lys Met Gly Leu Tyr Asp Lys Lys Gly His He 
100 105 110 

Met Phe Ala Thr Leu Cys Phe He Ala Met Leu Phe Ala Met Ser Val 
115 120 125 

Tyr Gly Val Leu Phe Cys Glu 
130 135 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 834 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: CDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 31.. 603 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:3: 

TTAGCCTTTA CTCTATAGTT TTAGATAGAC ATG GCG AAT GTG GAT CGT GAT CGG 

Met Ala Asn Val Asp Arg Asp Arg 

140 

CGT GTG CAT GTA GAC CGT ACT GAC AAA CGT GTT CAT CAG CCA AAC TAC 
Arg Val His Val Asp Arg Thr Asp Lys Arg Val His Gin Pro Asn Tyr 
145 150 155 

GAA GAT GAT GTC GGT TTT GGT GGC TAT GGC GGT TAT GGT GCT GGT TCT 
Glu Asp Asp Val Gly Phe Gly Gly Tyr Gly Gly Tyr Gly Ala Gly Ser 
160 165 170 175 

GAT TAT AAG AGT CGC GGC CCC TCC ACT AAC CAA ATC TTG GCA CTT ATA 
ASD Tvr Lys Ser Arg Gly Pro Ser Thr Asn Gin lie Leu Ala Leu lie 

180 185 190 

GCA GGA GTT CCC ATT GGT GGC ACA CTG CTA ACC CTA GCT GGA CTC ACT 
Ala Gly Val Pro lie Gly Gly Thr Leu Leu Thr Leu Ala Gly Leu Thr 
195 200 205 



CTA GCC GGT TCG GTG ATC GGC TTG CTA GTC TCC ATA CCC CTC TTC CTC 
Leu Ala Gly Ser Val He Gly Leu Leu Val Ser He Pro Leu Phe Leu 
210 215 220 

CTC TTC AGT CGG GTG ATA GTC CCG GCG GCT CTC ACT ATT GGG CTT GCT 
Leu Phe Ser Pro Val He Val Pro Ala Ala Leu Thr He Gly Leu Ala 



GTG ACG GGA ATC TTG GCT TCT GGT TTG TTT GGG TTG ACG GGT CTG AGC 

Val Thr Gly He Leu Ala Ser Gly Leu Phe Gly Leu Thr Gly Leu Ser 

240 245 250 255 

TCG GTC TCG TGG GTC CTC AAC TAC CTC CGT GGG ACG AGT GAT ACA GTG 

Ser Val Ser Trp Val Leu Asn Tyr Leu Arg Gly Thr Ser Asp Thr Val 

260 265 270 
CCA GAG CAA TTG GAC TAC GCT AAA CGG CGT ATG GCT GAT GCG GTA GGC 486

Pro Glu Gln Leu Asp Tyr Ala Lys Arg Arg Met Ala Asp Ala Val Gly
275 280 285 

TAT GCT GGT ATG AAG GGA AAA GAG ATG GGT CAG TAT GTG CAA GAT AAG 53 4 

Tyr Ala Gly Met Lys Gly Lys Glu Met Gly Gin Tyr Val Gin Asd Lvs 
290 295 300 

GCT CAT GAG GCT CGT GAG ACT GAG TTC ATG ACT GAG ACC CAT GAG CCG 582 
Ala His Glu Ala Arg Glu Thr Glu Phe Met Thr Glu Thr His Glu Pro 
305 310 315 

GGT AAG GCC AGG AG A GGC TCA TAAGCTAATA TAAATTGCGG GAGTCAGTTG 63 3 

Gly Lys Ala Arg Arg Gly Ser 
320 325 

GAAACGCGAT AAATGTAGTT TTACTTTTAT GTCCCAGTTT CTTTCCTCTT TTAAGAATAT 693 
CTTTGTCTAT ATATGTGTTC GTTCGTTTTG TCTTGTCCAA ATAAAAATCC TTGTTAGTGA 753 
AATAAGAAAT GAAATAAATA TGTTTTCTTT TTTGAGATAA CCAGAAATCT CATACTATTT 813 
TCTAAAAAAA AAAAAAAAAA A 834



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
1 5 10 15 

Lys Arg Val His Gin Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Tyr Gly Gly Tyr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gin lie Leu Ala Leu lie Ala Gly Val Pro lie Glv Glv Thr 
50 55 60 

Leu Leu Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val lie Gly Leu 
65 70 75 80 

Leu Val Ser lie Pro Leu Phe Leu Leu Phe Ser Pro Val lie Val Pro 

85 90 95 



Ala Ala Leu Thr Ile Gly Leu Ala Val Thr Gly Ile Leu Ala Ser Gly
100 105 110

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gin Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 

Met Gly Gin Tyr Val Gin Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 185 190 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
15 10 15 

Lys Arg Val His Gin Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 " 45 

Thr Asn Gin lie Leu Ala Leu lie Ala Gly Val Pro lie Gly Gly Thr 
50 55 60 

Leu lie Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val lie Gly Leu 
65 70 75 80 

Leu Val Ser lie Pro Leu Phe Leu lie Phe Ser Pro Val lie Val Pro 

85 90 95 

Ala Ala Leu Thr lie Gly Leu Ala Val Thr Gly lie Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 
Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gln Leu Asp Tyr Ala Lys
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 

Met Gly Gin Tyr Val Gin Asp Lys Ala His Glu Ala Arg Glu Thr Glu 

165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
180 185 190 

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 

Met Ala Asn Val Asp Arg Asp Arg Arg Val His Val Asp Arg Thr Asp 
15 10 15 

Lys Arg Val His Gin Pro Asn Tyr Glu Asp Asp Val Gly Phe Gly Gly 
20 25 30 

Thr Gly Gly Thr Gly Ala Gly Ser Asp Tyr Lys Ser Arg Gly Pro Ser 
35 40 45 

Thr Asn Gin lie Leu Ala Leu lie Ala Gly Val Pro lie Gly Gly Thr 
50 55 60 

Leu lie Thr Leu Ala Gly Leu Thr Leu Ala Gly Ser Val lie Gly Leu 
65 70 _ 75 80 

Leu Val Ser lie Pro Leu Phe Leu lie Phe Ser Pro Val lie Val Pro 

85 90 95 

Ala Ala Leu Thr lie Gly Leu Ala Val Thr Gly lie Leu Ala Ser Gly 
100 105 110 

Leu Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Asn Tyr 
115 120 125 

Leu Arg Gly Thr Ser Asp Thr Val Pro Glu Gin Leu Asp Tyr Ala Lys 
130 135 140 

Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu 
145 150 155 160 
Met Gly Gln Tyr Val Gln Asp Lys Ala His Glu Ala Arg Glu Thr Glu

165 170 175 

Phe Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Pro 
180 185 190 

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

Phe Gly Leu Thr Gly Leu Ser Ser Val Ser Trp Val Leu Gin Leu Pro 
1 5 . 10 15 

Pro Trp Ala Ser Asp Thr Val Pro Glu Gin Val Asp Tyr Ala Lys Arg 
20 25 30 

Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Met Lys Gly Lys Glu Met 
35 40 45 

Gly Gin Tyr Val Gin Asp Lys Ala His Glu Ala Arg Glu Thr Glu Phe 
50 55 60 

Met Thr Glu Thr His Glu Pro Gly Lys Ala Arg Arg Gly Ser 
65 70 75 

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids - 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

Met Ala Asp Thr Ala Arg Gly Thr His His Asp lie lie Gly Arg Asp 
15 10 15 

Gin Tyr Pro Met Met Gly Arg Asp Arg Asp Gin Tyr Gin Met Ser Gly 
20 25 30 

Arg Gly Ser Asp Tyr Ser Lys Ser Arg Gin lie Ala Lys Ala Ala Thr 
35 40 45 
Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Ser Leu Thr Leu
50 55 60 

Val Gly Thr Val Leu Ala Leu Thr Val Ala Thr Pro Leu Leu Val Leu 
65 70 75 80 

Phe Ser Pro lie Leu Val Pro Ala Leu lie Thr Val Ala Leu Leu He 

85 90 95 

Thr Gly Phe Leu Ser Ser Gly Gly Phe Gly He Ala Ala He Thr Val 
100 105 110 

Phe Ser Trp He Tyr Lys Tyr Ala Thr Gly Glu His Pro Gin Gly Ser 
115 120 125 

Asp Lys Leu Asp Ser Ala Arg Met Lys Leu Gly Ser Lys Ala Gin Asd 
130 135 140 

Leu Lys Asp Arg Ala Gin Tyr Tyr Gly Gin Gin His Thr Gly Gly Glu 
145 150 155 160 

His Asp Arg Asp Arg Thr Arg Gly Gly Gln His Thr Thr
165 170



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 

Met Ala Asp Gin Thr Arg Thr His His Glu Met He Ser Arg Asp Ser 
1 5 ' 10 ■ 15 

Thr Gin Glu Ala His Pro Lys Ala Arg Gin Trp Val Lys Ala Ala Thr 
20 25 30 

Ala Val Thr Ala Gly Gly Ser Leu Leu Val Leu Ser Gin Leu Thr Leu 
35 40 45 

Ala Gly Thr Val He Ala Leu Thr Val Ala Thr Pro Leu Leu Val He 
50 55 60 

Phe Ser Pro Val Leu Val Pro Ala Val Val Thr Val Ala Leu He He 
65 70 75 80 

Thr Gly Phe Leu Ala Ser Gly Gly Phe Gly Ile Ala Ala Ile Thr Ala
85 90 95

Phe Ser Trp Leu Tyr Arg His Trp Thr Gly Ser Gly Ser Asp Lys Ile
100 105 110 

Glu Trp Ala Arg Met Lys Val.Gly Ser Arg Val Gin Asp Thr Lys Tyr 
115 120 125 

Gly Gln His Trp Ile Gly Val Gln His Gln Gln Val Ser
130 135 140



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe
1 5 10 15

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp
20 25 30

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly
35 40 45

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val
50 55 60

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Leu Ala Gly Ser Val Ile
65 70 75 80

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile
85 90 95

Val Pro Ala Gly Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala
100 105 110

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met
115 120 125

Asn Tyr Leu Arg Gly Thr Ala Arg Thr Val Pro Glu Gln Leu Glu Tyr
130 135 140

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly
145 150 155 160

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln
165 170 175

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys Gly His Glu Thr
180 185 190

Gln Gly Gly Thr Thr Ala Ala
195

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Asp Thr His Arg Val Asp Arg Thr Asp Arg His Phe Gln Phe
1 5 10 15

Gln Ser Pro Tyr Glu Gly Gly Arg Gly Gln Gly Gln Tyr Glu Gly Asp
20 25 30

Arg Gly Tyr Gly Gly Gly Gly Tyr Lys Ser Met Met Pro Glu Ser Gly
35 40 45

Pro Ser Ser Thr Gln Val Leu Ser Leu Leu Ile Gly Val Pro Val Val
50 55 60

Gly Ser Leu Ile Ala Leu Ala Gly Leu Leu Ile Ala Gly Ser Val Ile
65 70 75 80

Gly Leu Met Val Ala Leu Pro Leu Phe Leu Ile Phe Ser Pro Val Ile
85 90 95

Val Pro Ala Ala Leu Thr Ile Gly Leu Ala Met Thr Gly Phe Leu Ala
100 105 110

Ser Gly Met Phe Gly Leu Thr Gly Leu Ser Ser Ile Ser Trp Val Met
115 120 125

Asn Tyr Leu Arg Gly Thr Arg Arg Thr Val Pro Glu Gln Leu Glu Tyr
130 135 140

Ala Lys Arg Arg Met Ala Asp Ala Val Gly Tyr Ala Gly Gln Lys Gly
145 150 155 160

Lys Glu Met Gly Gln His Val Gln Asn Lys Ala Gln Asp Val Lys Gln
165 170 175

Tyr Asp Ile Ser Lys Pro His Asp Thr Thr Thr Lys Gly His Glu Thr
180 185 190

Gln Gly Arg Thr Thr Ala Ala
195



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1257 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCCCATTT TGCAGACCAA 60

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360

GAACTATAAC CATTAACCGT AAAAATAAAT TTACTACAGT AAAAAATTAT ACTAATTTCA 420

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260

TAGATAG

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GTAATACGAC TCACTATAGG GC 22

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GGGGATCCTA TACTAAAACT ATAGAGTAAA GG 32
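
The two oligonucleotides of SEQ ID NO: 13 and SEQ ID NO: 14 can be checked against their declared lengths and against the 3' end of SEQ ID NO: 12. The following Python sketch is illustrative only and is not part of the original disclosure. The helper names `reverse_complement` and `longest_prefix_in` are hypothetical, and the 27-nucleotide tail of SEQ ID NO: 12 is copied from the listing as printed here, so any transcription error in the listing carries over.

```python
# Sequences transcribed from the listing (block spacing removed).
SEQ_ID_13 = "GTAATACGACTCACTATAGGGC"
SEQ_ID_14 = "GGGGATCCTATACTAAAACTATAGAGTAAAGG"
# Assumption: last 27 nucleotides of SEQ ID NO: 12 as printed in this listing.
SEQ_ID_12_TAIL = "TAGCCTTTACTCTATAGTTTTAGATAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_prefix_in(query: str, target: str) -> int:
    """Length of the longest prefix of `query` that occurs anywhere in `target`."""
    for n in range(len(query), 0, -1):
        if query[:n] in target:
            return n
    return 0


if __name__ == "__main__":
    rc14 = reverse_complement(SEQ_ID_14)
    print(len(SEQ_ID_13), len(SEQ_ID_14))
    print(rc14)
    print(longest_prefix_in(rc14, SEQ_ID_12_TAIL))
    print("GGATCC" in SEQ_ID_14)
```

With the sequences as printed, both lengths agree with the declared values (22 and 32), the reverse complement of SEQ ID NO: 14 matches the tail of SEQ ID NO: 12 over its first 20 nucleotides, and SEQ ID NO: 14 carries a GGATCC hexamer (the BamHI recognition sequence) near its 5' end.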

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Trp Ile Gly His Asp Ala Gly His
1 5 
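
The short peptide entries in this listing lend themselves to a mechanical consistency check: convert the three-letter residue codes to one-letter codes and compare the residue count with the declared LENGTH. The Python sketch below is an illustration added for that purpose and is not part of the original disclosure; the function name `to_one_letter` is hypothetical. Tokens that are not one of the twenty standard three-letter codes (for example a misread `He` or `Gin`) are reported instead of being silently skipped.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert a whitespace-separated three-letter listing to one-letter codes."""
    residues = listing.split()
    unknown = [r for r in residues if r not in THREE_TO_ONE]
    if unknown:
        # Surfaces misread tokens rather than guessing at them.
        raise ValueError(f"unrecognized residue codes: {unknown}")
    return "".join(THREE_TO_ONE[r] for r in residues)


if __name__ == "__main__":
    seq_15 = "Trp Ile Gly His Asp Ala Gly His"  # SEQ ID NO: 15
    peptide = to_one_letter(seq_15)
    print(peptide, len(peptide))
```

For SEQ ID NO: 15 the conversion gives an eight-residue string, in agreement with the declared length of 8 amino acids.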




(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Asn Val Gly His Asp Ala Asn His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Val Leu Gly His Asp Cys Gly His
1 5 



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Val Ile Ala His Glu Cys Gly His
1 5 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Val Ile Gly His Asp Cys Ala His
1 5 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Val Val Gly His Asp Cys Gly His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

His Asn Ala His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

His Asn Tyr Leu His His 
1 5

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

His Arg Thr His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

His Arg Arg His His
1 5 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

His Asp Arg His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

His Asp Gln His His
1 5 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

His Asp His His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

His Asn His His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Phe Gln Ile Glu His His
1 5

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

His Gln Val Thr His His
1 5 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

His Val Ile His His
1 5 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

His Val Ala His His 
1 5 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

His Ile Pro His His
1 5 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

His Val Pro His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1941 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



GAGCTCGATC ACACAAAGAA AACGTCAAAT GGATCATACT GGGCGCATTT TGCAGACCAA 60

GAGAAAGTGA GAGAGAGTTG TCCTCTCGTT ATCAAGTAAC AGTAGACCAC CACTAAACCG 120

CCAATAGCTT ATAATCAAAA TAGAAAGGTC TAATAACAGA AACAAATGAA AAAGCCTTGT 180

TCCATGGACT GCCTACCCGA ATTGATTGAT TCGACTAGTT TTTCTTCTTC TTTGATTAAG 240

ACCTCCGTAA GAAAAATGGT ACTACTAAAG CCACTCGCTA CCAAAACTAA ACCATTCCAG 300

ACTGTAACTG GACCAATATT TCTAAACTGT AACCAGATCT CAAACATATA AACTAATTAA 360

GAACTATAAC CATTAACCGT AAAAATAAAT TTAGTACAGT AAAAAATTAT ACTAATTTCA 420

GCTATGATGG AATTTCAGCT CTTAAGAGTT GTGGAAATCA AGTAAACCTA AAATCCTAAT 480

AATATTCTTC ATCCTTATTT TTGTTTCACA TGCATGCTGT CCAATCTGTT ATTAGCATTT 540

GAAAGCCTAA AATTCTATAT ACAGTACAAT AAATCTAATT AATTTTCATT ACTAATAAAA 600

TGCTTCATAT ATACTCTTGT ATTTATAAAT CATCCGTTAT CGTTACTATA CCTTTATACA 660

TCATCCTACA TTCATACCTA AGCTAGCAAA GCAAACTACT AAAAGGGTCG TCAACGCAAG 720

TTATTTGCTA GTTGGTGCAT ACTACACACG GCTACGGCAA CATTAAGTAA CACATTAAGA 780

GGTGTTTTCT TAATGTAGTA TGGTAATTAT ATTTATTTCA AAACTTGGAT TAGATATAAA 840

GGTACAGGTA GATGAAAAAT ATTTGGTTAG CGGGTTGAGA TTAAGCGGAT ATAGGAGGCA 900

TATATACAGC TGTGAGAAGA AGAGGGATAA ATACAAAAAG GGAAGGATGT TTTTGCCGAC 960

AGAGAAAGGT AGATTAAGTA GGCATCGAGA GGAGAGCAAT TGTAAAATGG ATGATTTGTT 1020

TGGTTTTGTA CGGTGGAGAG AAGAACGAAA AGATGATCAG GTAAAAAATG AAACTTGGAA 1080

ATCATGCAAA GCCACACCTC TCCCTTCAAC ACAGTCTTAC GTGTCGTCTT CTCTTCACTC 1140

CATATCTCCT TTTTATTACC AAGAAATATA TGTCAATCCC ATTTATATGT ACGTTCTCTT 1200

AGACTTATCT CTATATACCC CCTTTTAATT TGTGTGCTCT TAGCCTTTAC TCTATAGTTT 1260

TAGATAGACA TGGCGAATGT GGATCGTGAT CGGCGTGTGC ATGTAGACCG TACTGACAAA 1320

CGTGTTCATC AGCCAAACTA CGAAGATGAT GTCGGTTTTG GTGGCTATGG CGGTTATGGT 1380

GCTGGTTCTG ATTATAAGAG TCGCGGCCCC TCCACTAACC AAGTATTTTT GTGGTCTCTT 1440

TAGTTTTTCT TGTGTTTTCC TATGATCACG CTCTCCAAAC TATTTGAAGA TTTTCTGTAA 1500

ATTCATTTTA AACAGAAAGA TAAATAAAAT AGTGAAGAAC CATAGGAATC GTACGTTACG 1560

TTAATTATTT CCTTTTAGTT CTTAAGTCCT AATTAGGATT CCTTTAAAAG TTGCAACAAT 1620

CTAATTGTTC ACAAAATGAG TAAAGTTTGA AACAGATTTT TATACACCAC TTGCATATGT 1680

TTATCATGGT GATGCATGCT TGTTAGATAA ACTCGATATA ATCAATACAT GCAGATCTTG 1740

GCACTTATAG CAGGAGTCCA TTGGTGGCAC ACTGCTAACC CTAGCTGGAC TCACTCTAGC 1800

CGGTTCGGTG ATCGGCTTGC TAGTCTCCAT ACCCCTCTTC CTCCTCTTCA GTCCGGTGAT 1860

AGTCCCGGCG GCTCTCACTA TTGGGCTTGC TGTGACGGGA ATCTTGGCTT CTGGTTTGTT 1920

TGGGTTGACG GGTCTGAGCT C 1941
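
Each nucleotide row in this listing carries a running position count at its right margin, which allows a simple integrity check on a transcribed row: the number of bases accumulated so far should equal the printed count. The Python sketch below is illustrative and not part of the original disclosure; the function name `check_rows` and its `start` parameter are hypothetical. It is shown on the last two rows of SEQ ID NO: 35.

```python
def check_rows(rows, start=0):
    """Validate numbered sequence rows and return the final running total.

    Each row is a run of space-separated base blocks, optionally followed by
    the cumulative position count. `start` is the count before the first row.
    """
    total = start
    for row in rows:
        fields = row.split()
        if fields and fields[-1].isdigit():
            *blocks, label = fields
            expected = int(label)
        else:
            blocks, expected = fields, None
        bases = "".join(blocks)
        bad = set(bases) - set("ACGT")
        if bad:
            raise ValueError(f"non-ACGT characters {sorted(bad)} in row: {row!r}")
        total += len(bases)
        if expected is not None and total != expected:
            raise ValueError(f"row ends at {total}, but the margin says {expected}")
    return total


if __name__ == "__main__":
    last_rows = [
        "AGTCCCGGCG GCTCTCACTA TTGGGCTTGC TGTGACGGGA ATCTTGGCTT CTGGTTTGTT 1920",
        "TGGGTTGACG GGTCTGAGCT C 1941",
    ]
    # The preceding row of SEQ ID NO: 35 ends at position 1860.
    print(check_rows(last_rows, start=1860))
```

A row whose margin count is split by a stray space, or whose blocks contain characters other than A, C, G and T, raises a `ValueError` instead of being counted.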
What is claimed is:

1. An isolated nucleic acid encoding an
oleosin 5' regulatory region which directs seed- 
specific expression selected from the groups 
consisting of the nucleotide sequence set forth in SEQ 
ID NO: 12, the nucleotide sequence set forth in SEQ ID 
NO: 12 having an insertion, deletion, or substitution 
of one or more nucleotides, or a contiguous fragment 
of the nucleotide sequence set forth in SEQ ID NO: 12. 

2. An expression cassette which comprises
the oleosin 5' regulatory region of Claim 1 operably
linked to at least one of a nucleic acid encoding a
heterologous gene or a nucleic acid encoding a
sequence complementary to a native plant gene.

3. The expression cassette of Claim 2 
wherein the heterologous gene is at least one of a 
fatty acid synthesis gene or a lipid metabolism gene. 

4. The expression cassette of Claim 3 
wherein the heterologous gene is selected from the 
group consisting of an acetyl-coA carboxylase gene, a
ketoacyl synthase gene, a malonyl transacylase gene, a 
lipid desaturase gene, an acyl carrier protein (ACP) 
gene, a thioesterase gene, an acetyl transacylase 
gene, or an elongase gene. 

5. The expression cassette of Claim 4
wherein the lipid desaturase gene is selected from the
group consisting of a Δ6-desaturase gene, a Δ12-desaturase
gene, and a Δ15-desaturase gene.

6. An expression vector which comprises the
expression cassette of any one of Claims 2-5.

7. A cell comprising the expression
cassette of any one of Claims 2-5.

8. A cell comprising the expression vector
of Claim 6.

9. The cell of Claim 7 wherein said cell is
a bacterial cell or a plant cell.

10. The cell of Claim 8 wherein said cell
is a bacterial cell or a plant cell.

11. A transgenic plant comprising the 
expression cassette of any one of Claims 2-5. 

12. A transgenic plant comprising the 
expression vector of Claim 6. 

[...] wherein said plant is at least one of a sunflower, soybean,
maize, cotton, tobacco, peanut, oil seed rape or
Arabidopsis plant.

16. Progeny of the plant of Claim 11 or 12.

17. Seed from the plant of Claim 11 or 12.

18. A method of producing a plant with
increased levels of a product of a fatty acid
synthesis or lipid metabolism gene which comprises:

(a) transforming a plant cell with an
expression vector comprising the isolated nucleic acid
of Claim 1 operably linked to at least one of an
isolated nucleic acid coding for a fatty acid
synthesis gene or a lipid metabolism gene; and

(b) regenerating a plant with increased 
levels of the product of said fatty acid synthesis or 
said lipid metabolism gene from said plant cell. 

19. A method of producing a plant with 
increased levels of gamma linolenic acid (GLA) content 
which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a Δ6-desaturase gene;
and 

(b) regenerating a plant with increased 
levels of GLA from said plant cell. 

20. The method of Claim 19 wherein said Δ6-desaturase
gene is at least one of a cyanobacterial
Δ6-desaturase gene or a Borage Δ6-desaturase gene.

21. The method of any one of Claims 18-20
wherein said plant is a sunflower, soybean, maize, 
tobacco, cotton, peanut, oil seed rape or Arabidopsis 
plant. 

22. The method of Claim 18 wherein said 
fatty acid synthesis gene or said lipid metabolism 
gene is at least one of a lipid desaturase, an acyl 
carrier protein (ACP) gene, a thioesterase gene, an
elongase gene, an acetyl transacylase gene, an acetyl-coA
carboxylase gene, a ketoacyl synthase gene, or a
malonyl transacylase gene. 

23. A method of inducing production of at
least one of gamma linolenic acid (GLA) or
octadecatetraenoic acid (OTA) in a plant deficient or
lacking in GLA which comprises transforming said plant
with an expression vector comprising the isolated
nucleic acid of Claim 1 operably linked to a Δ6-desaturase
gene and regenerating a plant with
increased levels of at least one of GLA or OTA.

24. A method of decreasing production of a 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a plant cell with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
complementary to a fatty acid synthesis or lipid 
metabolism gene; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 

25. A method of cosuppressing a native 
fatty acid synthesis or lipid metabolism gene in a 
plant which comprises: 

(a) transforming a cell of the plant with an 
expression vector comprising the isolated nucleic acid 
of Claim 1 operably linked to a nucleic acid sequence 
encoding a fatty acid synthesis or lipid metabolism 
gene native to the plant; and 

(b) regenerating a plant with decreased 
production of said fatty acid synthesis or said lipid 
metabolism gene. 